Preface



For the past 25 years the CADE conference has been the major forum for the 
presentation of new results in automated deduction. This volume contains the 
papers and system descriptions selected for the 17th International Conference 
on Automated Deduction, CADE-17, held June 17-20, 2000, at Carnegie Mellon 
University, Pittsburgh, Pennsylvania (USA). 

Fifty-three research papers and twenty system descriptions were submitted 
by researchers from fifteen countries. Each submission was reviewed by at least 
three reviewers. Twenty-four research papers and fifteen system descriptions
were accepted. The accepted papers cover a variety of topics related to theorem
proving and its applications, such as proof-carrying code, cryptographic
protocol verification, model checking, cooperating decision procedures, program 
verification, and resolution theorem proving. 

The program also included three invited lectures: “High-level verification
using theorem proving and formalized mathematics” by John Harrison, “Scalable
Knowledge Representation and Reasoning Systems” by Henry Kautz, and
“Connecting Bits with Floating-Point Numbers: Model Checking and Theorem
Proving in Practice” by Carl Seger. Abstracts or full papers of these talks are
included in this volume. In addition to the accepted papers, system descriptions,
and invited talks, this volume contains one-page summaries of four tutorials and
five workshops held in conjunction with CADE-17.

The CADE-17 ATP System Competition (CASC-17), held in conjunction 
with CADE-17, selected a winning system in each of four different automated 
theorem proving divisions. The competition was organized by Geoff Sutcliffe and 
Christian Suttner and was overseen by a panel consisting of Claude Kirchner, 
Don Loveland, and Jeff Pelletier. This was the fifth such competition held in 
conjunction with CADE. Since the contest was held during the conference, the
winners were unknown as of this printing and the results are not described here.

I would like to thank the members of the program committee and all the 
referees for their care and time in selecting the submitted papers. I would also 
like to give a special thanks to Bill McCune for setting up and maintaining the 
web site for the electronic program committee meeting. 



April 2000 



David McAllester
High-Level Verification Using Theorem Proving
and Formalized Mathematics 
(Extended Abstract) 



John Harrison 

Intel Corporation, EY2-03 
5200 NE Elam Young Parkway 
Hillsboro, OR 97124, USA 
johnh@ichips.intel.com



Abstract. Quite concrete problems in verification can throw up the 
need for a nontrivial body of formalized mathematics and draw on several 
special automated proof methods which can be soundly integrated into a 
general LCF-style theorem prover. We emphasize this point based on our 
own work on the formal verification in the HOL Light theorem prover of 
floating point algorithms. 



1 Formalized Mathematics in Verification 

Much of our PhD research [11] was devoted to developing formalized
mathematics, in particular real analysis, with a view to its practical application
in verification, and our current work in formally verifying floating point
algorithms shows that this direction of research is quite justified.

First of all, it almost goes without saying that some basic facts about real
numbers are useful. Admittedly, floating point verification has been successfully
done in systems that do not support real numbers at all [16,17,19]. After all,
floating point numbers in conventional formats are all rational (with denominators
always a power of 2). Nevertheless, the whole point of floating point numbers is
that they are approximations to reals, and the main standard governing floating
point correctness [13] defines behavior in terms of real numbers. Without using
real numbers it is already necessary to specify the square root function in an
unnatural way, and for more complicated functions such as sin it seems hardly
feasible to make good progress in specification or verification without using real
numbers explicitly.

In fact, one needs a lot more than simple algebraic properties of the reals.
Even defining the common transcendental functions and deriving useful
properties of them requires a reasonable body of analytical results about limits,
power series, derivatives etc. In short, one needs a formalized version of a lot
of elementary real analysis, an unusual mixture of the general and the special.
A typical general result that is useful in verification is the following:

    If a function f is differentiable with derivative f' in an interval [a, b],
    then a sufficient condition for f(x) ≤ K throughout the interval is that
    f(x) ≤ K at the endpoints a, b and at all points of zero derivative.
This theorem is used, for example, in finding a bound for the error incurred
in approximating a transcendental function by a truncated power series. The 
formal HOL version of this theorem looks like this: 



|- (!x. a <= x /\ x <= b ==> (f diffl (f' x)) x) /\
   f(a) <= K /\
   f(b) <= K /\
   (!x. a <= x /\ x <= b /\ (f'(x) = &0) ==> f(x) <= K)
   ==> (!x. a <= x /\ x <= b ==> f(x) <= K)
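As an informal numerical illustration of this theorem (it is not part of the HOL development), the following Python sketch bounds a cubic on an interval by inspecting only the endpoints and the zeros of the derivative. The function `f`, the hand-supplied list `CRITICAL_POINTS` and the helper `bound_on_interval` are assumptions chosen for the example.

```python
def f(x):
    """Example function; its derivative is f'(x) = 3x^2 - 3."""
    return x**3 - 3*x

# Zeros of f', worked out by hand for this particular f (an assumption
# of the example, not something the code derives).
CRITICAL_POINTS = [-1.0, 1.0]

def bound_on_interval(a, b):
    """Largest value of f over the endpoints and the critical points in [a, b]."""
    candidates = [a, b] + [c for c in CRITICAL_POINTS if a <= c <= b]
    return max(f(x) for x in candidates)

K = bound_on_interval(-2.0, 2.0)

# Sanity check on a grid: by the theorem, f(x) <= K on the whole interval.
samples = [-2.0 + 4.0 * i / 1000 for i in range(1001)]
assert all(f(x) <= K + 1e-12 for x in samples)
```

The grid check is only a plausibility test; the point of the formal theorem is that no sampling is needed once the finitely many candidate points have been examined.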



A typical concrete result is a series expansion for π [1]:

$$\pi = \sum_{n=0}^{\infty} \frac{1}{16^n} \left( \frac{4}{8n+1} - \frac{2}{8n+4} - \frac{1}{8n+5} - \frac{1}{8n+6} \right)$$



This allows us to approximate π arbitrarily closely by rational numbers.
Doing so is important both for detailed analysis of trigonometric range reduction
(reducing an argument x to a trigonometric function to r where x = r + Nπ/2)
and to dispose of trivial side-conditions. For example, an algorithm might rely
on the fact that sin(x) is positive for some particular x, and we can verify this
by confirming that 0 < x < π using an approximation of π. In HOL, the formal
theorem is as follows:



|- (\n. inv(&16 pow n) * (&4 / &(8 * n + 1) - &2 / &(8 * n + 4) - 

&1 / &(8 * n + 5) - &1 / &(8 * n + 6))) sums pi 
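To make the use of this series concrete, here is a small Python sketch, assuming exact rational arithmetic via `fractions.Fraction`. It computes rational lower and upper bounds for π from a partial sum and uses them to discharge a side-condition of the form 0 < x < π. The names `bbp_term` and `pi_bounds`, and the simple geometric tail bound, are choices made for this illustration and are not taken from the HOL proofs.

```python
from fractions import Fraction

def bbp_term(n):
    """n-th term of the series, as an exact rational."""
    return Fraction(1, 16**n) * (Fraction(4, 8*n + 1) - Fraction(2, 8*n + 4)
                                 - Fraction(1, 8*n + 5) - Fraction(1, 8*n + 6))

def pi_bounds(terms):
    """Return rationals (lo, hi) with lo <= pi <= hi; requires terms >= 1."""
    lo = sum((bbp_term(n) for n in range(terms)), Fraction(0))
    # Every term is positive and, for n >= 1, smaller than 16**-n, so the
    # omitted tail is below the geometric sum 16**(1 - terms) / 15.
    hi = lo + Fraction(16, 15 * 16**terms)
    return lo, hi

lo, hi = pi_bounds(8)

# Side-condition check: x = 3 satisfies 0 < x < pi because x < lo <= pi.
x = Fraction(3)
side_condition_holds = 0 < x < lo
```

Because only the rational lower bound is compared with x, the check never relies on floating point rounding.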



The mathematics needed in floating-point verification is an unusual mixture
of these general and special facts, and it’s sometimes the kind that isn’t widely
found in textbooks. For example, an important result we use is the power series
expansion for the cotangent function (for x ≠ 0):

$$\cot(x) = \frac{1}{x} - \frac{1}{3}x - \frac{1}{45}x^3 - \frac{2}{945}x^5 - \cdots$$
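A quick numerical sanity check of the quoted coefficients can be sketched in Python as follows. This is only an illustration in ordinary floating point, assuming the four displayed terms; it says nothing about the recurrence or the coefficient bounds that the formal proof needs. The names `COT_COEFFS` and `cot_series` are hypothetical.

```python
import math
from fractions import Fraction

# Coefficients of x, x^3 and x^5 in the expansion quoted above.
COT_COEFFS = [Fraction(1, 3), Fraction(1, 45), Fraction(2, 945)]

def cot_series(x):
    """Truncated expansion 1/x - x/3 - x^3/45 - 2x^5/945."""
    total = 1.0 / x
    for k, c in enumerate(COT_COEFFS):
        total -= float(c) * x ** (2 * k + 1)
    return total

x = 0.5
error = abs(cot_series(x) - math.cos(x) / math.sin(x))
```

For small x the discrepancy is dominated by the first omitted term, which is of order x^7.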



Deriving this straightforward-looking theorem, which involves obtaining both a
simple recurrence relation for the coefficients and a reasonably sharp bound on
their size, is fairly non-trivial. A typical mathematics book either doesn’t
mention such a concrete result at all, or gives it without proof as part of a
“cookbook” of well-known useful results. After some time browsing in a library,
we eventually settled on formalizing a proof in Knopp’s classic book on infinite
series [14]. Formalizing this took several days of work, drawing extensively on
existing analytical lemmas in HOL. A side-effect is that we derived a general
result on harmonic sums, the simplest special cases of which are the well-known:

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{\pi^2}{6}$$

and

$$1 + \frac{1}{2^4} + \frac{1}{3^4} + \frac{1}{4^4} + \cdots = \frac{\pi^4}{90}$$
Knopp remarks:

    It is not superfluous to realize all that was needed to obtain even the
    first of these elegant formulae.

We may add that it is even more surprising that such extensive mathematical
developments are used simply to verify that a floating point tangent function
satisfies a certain error bound. Of course, one also needs plenty of specialized
facts about floating point arithmetic, e.g. important properties of rounding.
These theories have also been developed in HOL Light [12] but we will not go
into more detail here.

2 Proof in HOL Light 

The theorem prover we are using in our work is HOL Light [8],¹ a version of the
HOL prover [5]. HOL is a descendant of Edinburgh LCF [6], which first defined the
‘LCF approach’ that these systems take to formal proof. LCF provers explicitly
generate proofs in terms of extremely low-level primitive inferences, in order to
provide a high level of assurance that the proofs are valid. In HOL Light, as
in most other LCF-style provers, the proofs (which can be very large) are not
usually stored permanently, but the strict reduction to primitive inferences is
maintained by the abstract type system of the interaction and implementation
language, which for HOL Light is CAML Light [4,23]. The primitive inference
rules of HOL Light, which implements a simply typed classical higher order logic,
are very simple, and are summarized below.



    ----------- REFL
     ⊢ t = t

    Γ ⊢ s = t    Δ ⊢ t = u
    ------------------------ TRANS
         Γ ∪ Δ ⊢ s = u

    Γ ⊢ s = t    Δ ⊢ u = v
    ------------------------ MK_COMB
      Γ ∪ Δ ⊢ s(u) = t(v)

          Γ ⊢ s = t
    ------------------------ ABS
    Γ ⊢ (λx. s) = (λx. t)

    ------------------ BETA
     ⊢ (λx. t) x = t

    ----------- ASSUME
     {p} ⊢ p

    Γ ⊢ p = q    Δ ⊢ p
    -------------------- EQ_MP
         Γ ∪ Δ ⊢ q

         Γ ⊢ p    Δ ⊢ q
    --------------------------------- DEDUCT_ANTISYM_RULE
    (Γ − {q}) ∪ (Δ − {p}) ⊢ p = q

    Γ[x1, ..., xn] ⊢ p[x1, ..., xn]
    --------------------------------- INST
    Γ[t1, ..., tn] ⊢ p[t1, ..., tn]

    Γ[α1, ..., αn] ⊢ p[α1, ..., αn]
    --------------------------------- INST_TYPE
    Γ[γ1, ..., γn] ⊢ p[γ1, ..., γn]

¹ See http://www.cl.cam.ac.uk/users/jrh/hol-light/index.html



In MK_COMB, the types must agree, e.g. s : σ → τ, t : σ → τ, u : σ and v : σ.
In ABS, we require that x is not a free variable in any of the assumptions Γ. In
ASSUME, p must be of Boolean type, i.e. a proposition.

All theorems in HOL are deduced using just the above rules, starting from 
three axioms: Extensionality, Choice and Infinity. There are also definitional 
mechanisms allowing the introduction of new constants and types, but these are 
easily seen to be logically conservative and thus avoidable in principle. 

CAML Light also serves as a programming medium allowing higher-level 
derived rules (e.g. to automate linear arithmetic, first order logic or reasoning in 
other special domains) to be programmed as reductions to primitive inferences, 
so that proofs can be partially automated. This is very useful in practice. In 
floating point proofs we make extensive use of quite intricate facts of linear 
arithmetic, such as: 



|- x <= a /\ y <= b /\
   abs(x - y) < abs(x - a) /\ abs(x - y) < abs(x - b) /\
   (x <= b ==> abs(x - a) <= abs(x - b)) /\
   (y <= a ==> abs(y - b) <= abs(y - a))
   ==> (a = b)
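Facts of this kind are easy to misjudge by eye. As an informal plausibility check (not a proof, and not how HOL Light's linear arithmetic procedure works), the following Python sketch searches a small integer grid for counterexamples to the implication. The grid size and the names `hypothesis`, `witnesses` and `counterexamples` are assumptions of the example.

```python
from itertools import product

def hypothesis(x, y, a, b):
    """Conjunction of the antecedents of the lemma above."""
    return (x <= a and y <= b
            and abs(x - y) < abs(x - a) and abs(x - y) < abs(x - b)
            and (not x <= b or abs(x - a) <= abs(x - b))   # x <= b ==> ...
            and (not y <= a or abs(y - b) <= abs(y - a)))  # y <= a ==> ...

grid = range(-4, 5)

# Points where all antecedents hold: the lemma is not vacuous on the grid.
witnesses = [p for p in product(grid, repeat=4) if hypothesis(*p)]

# Points where the antecedents hold but the conclusion a = b fails.
counterexamples = [(x, y, a, b) for (x, y, a, b) in witnesses if a != b]
```

An empty `counterexamples` list is consistent with the lemma but establishes nothing beyond the grid; the automated derived rule produces an actual theorem.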



Proving these by low-level primitive inferences can be tedious in the extreme, 
so it is immensely valuable to have the process automated. Similarly, we often 
use first order automation to avoid tedious low-level reasoning (e.g. chaining 
together many inequalities) or exploit symmetries via lemmas such as: 



|- (!x y. P x y = P y x) /\
   (!x y. Q x ==> P x y)
   ==> !x y. Q x \/ Q y ==> P x y



Because these are all programmed as reductions to primitive inferences, we 
have the security of knowing that any errors in the derived rule cannot result 
in false “theorems” as long as the few primitive rules are sound. This can be 
especially important in verification of real industrial systems, since an error in 
a ‘proof’ can invalidate the entire result. 
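The idea can be sketched in a few lines of Python. This is only an analogy: Python has no abstract types, so the private token `_key` merely stands in for the protection that CAML Light's type system gives to HOL Light's theorem type, and the tiny equational fragment (with a hypothetical `new_axiom` to supply starting equations) is far simpler than the real kernel.

```python
class Thm:
    """A theorem 'lhs = rhs'. Only the rules below are meant to build one."""
    _key = object()  # capability token standing in for an abstract type

    def __init__(self, key, lhs, rhs):
        if key is not Thm._key:
            raise ValueError("theorems can only be built by inference rules")
        self.lhs, self.rhs = lhs, rhs

# Trusted core: a way to state axioms plus two primitive rules.
def new_axiom(lhs, rhs):
    return Thm(Thm._key, lhs, rhs)

def REFL(t):
    return Thm(Thm._key, t, t)

def TRANS(th1, th2):
    if th1.rhs != th2.lhs:
        raise ValueError("TRANS: middle terms differ")
    return Thm(Thm._key, th1.lhs, th2.rhs)

# Derived rule: it never touches the constructor, only the primitives,
# so a bug here can raise an exception but cannot forge a theorem.
def TRANS_CHAIN(start, theorems):
    result = REFL(start)
    for th in theorems:
        result = TRANS(result, th)
    return result

ab, bc, cd = new_axiom("a", "b"), new_axiom("b", "c"), new_axiom("c", "d")
th = TRANS_CHAIN("a", [ab, bc, cd])
```

However elaborate a derived rule becomes, the only code that has to be trusted is the small block of primitives.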

The basic LCF approach of exploiting traditional automated techniques [3,15] 
or high-level methods of proof description [9] by reducing them to primitive 
inferences in a single core logic seems to us a very fruitful one. Of course, it has
an efficiency penalty, but as we argue in [7], it is not usually too severe except 
in a few special cases. Nevertheless, there is still much more work to be done to 
make systems like HOL Light really usable by a nonspecialist. In our opinion, the 
most impressive system for formalizing abstract mathematics is Mizar [18,22], 
and importing the strengths of that system into LCF-style provers is a popular 
topic of research [10,21,24,26]. 

The first sustained attempt to actually formalize a body of mathematics
(concepts and proofs) was Principia Mathematica [25]. This successfully derived
a body of fundamental mathematics from a small logical system. However, the
task of doing so was extraordinarily painstaking, and indeed Russell [20]
remarked that his own intellect ‘never quite recovered from the strain of writing
it’. The correctness theorems we are producing in our work often involve tens or
hundreds of millions of applications of primitive inference rules, and build from
foundational results about the natural numbers up to nontrivial and highly
concrete applied mathematics. Yet using HOL Light, which can bridge the abyss
between simple primitive inferences and the demands of real applications, doing
so is quite feasible.
Abstract. Proof-carrying code and other applications in computer
security require machine-checkable proofs of properties of machine-language
programs. These in turn require axioms about the opcode/operand
encoding of machine instructions and the semantics of the encoded
instructions. We show how to specify instruction encodings and semantics
in higher-order logic, in a way that preserves the factoring of similar
instructions in real machine architectures. We show how to automatically
generate proofs of instruction decodings, global invariants from
local invariants, Floyd-Hoare rules and predicate transformers, all from
the specification of the instruction semantics. Our work is implemented
in ML and Twelf, and all the theorems are checked in Twelf.



1 Introduction 

The security problem for mobile code or for component software is this: an 
untrusted program (or program fragment) is to execute in a host environment 
(the code consumer), and we want to ensure that it will do no harm. Proof 
Carrying Code (PCC) [1] is a framework for solving this problem by providing 
such assurances to the host. In the PCC framework the code consumer advertises 
a safety policy which specifies the logic in which it will accept proofs, the regions 
of readable or writable addresses, and so on. The code producer must construct 
a proof that the machine-language program satisfies the safety policy; the proof 
might be generated using hints from the compiler that generated the code. This 
proof along with the code is communicated to the host environment and the host 
verifies it before executing the code. PCC has significant advantages over other 
approaches that address the same problem (such as software fault isolation [6] 
or byte code interpretation [7]): no performance penalty is taken since the code 
is run at native speeds, and the proofs are performed on native machine code 
so no unsoundness can be introduced in the translation (or compilation) from 
the proved program to the one that will actually execute. For well-chosen safety 
policies, the proofs can be generated completely automatically. 

In Appel and Felty [5] we gave an overview of our PCC system and described
how it differs from the approach taken by Necula [2]. Instead of building
type-inference rules into the safety policy, we model types as defined predicates
using the primitives of ordinary logic; we prove typing rules as lemmas, and show how
to model a wide variety of type constructors. This way the PCC safety policy 
is independent of the code producer’s programming language and type system. 
The machine description semantics are moved from the verification-condition 
generator to the safety policy. More specifically our safety policy consists of the 
following: 

1. The logic: a fairly standard higher order logic¹ (ℒ) consisting of eight
   inference rules for the logic and twenty-nine for arithmetic (with addition
   and multiplication taken as primitives).

2. The machine code syntax and semantics: this is encoded as the definition of
   the step relation (↦) that describes the syntax and semantics of the
   machine. Step formally captures the notion of a single instruction execution.
   These axioms also define the decode relation that completely specifies
   instruction opcodes and operands (machine syntax) for all legal machine-code
   instructions.

3. Safety constraints: these are statements² in ℒ that describe general
   properties of the runtime system (such as readable and safe-to-jump memory
   locations). They may also contain typing judgments for the initial contents
   of the register bank.

The small size of the logic is one of the major advantages of our approach. 
It contains no inference rules on types and no Hoare-logic rules for instructions 
(thus avoiding all complications due to substitution). Since it is so small, the 
proof checker can be likewise small. Thus the trusted computing base (TCB) 
can be verified easily (either by hand or through other means). A small TCB is 
the essence of PCC. 

To simplify the presentation of the following sections we will use the toy
machine (from [5]), a word-addressed 16-bit CPU. Its instruction set is presented
in figure 1. Our system currently works with two other machine architectures
(Sparc and Mips) and when appropriate we will also use examples from these.

2 Overview 

Our focus in this paper is twofold: concise axioms modeling machine
architectures, and efficient proofs using those axioms.

¹ Our logic ℒ is a sublogic of the Calculus of Constructions [11] and of the logic used
in the HOL theorem prover [12], so our proofs can be checked in either Coq or HOL.
Our current implementation uses Twelf [4].

² We offer a brief introduction to the syntax of our object logic: A metalogic (Twelf)
type is a type, and an object-logic type is a tp. Object-logic types are constructed
from num (the type of rationals), form (the type of formulas) and the arrow
constructor. Object-level terms of type T have type (tm T) in the metalogic. Terms of
type (pf A) are terms representing proofs of object formula A. The term lam [x] F(x)
is the object-logic function that maps x to F(x), and @ is the application operator
for λ-terms. See Appel and Felty [5] for more details.
Instruction   Fields        Effect
-----------   -----------   ------------------------------------------------
add           0 d s1 s2     r_d := r_s1 + r_s2
addi          1 d s c       r_d := r_s + sign_ext(c)
load          2 d s c       r_d := m[r_s + sign_ext(c)]
store         3 s1 s2 c     m[r_s2 + sign_ext(c)] := r_s1
jump          4 d s c       r_d := r_pc ; r_pc := r_s + sign_ext(c)
bgt           5 s1 s2 c     if r_s1 > r_s2 then r_pc := r_pc + sign_ext(c)
beq           6 s1 s2 c     if r_s1 = r_s2 then r_pc := r_pc + sign_ext(c)



Fig. 1. The toy machine instruction set. 



We will describe in detail our step relation and show how it succinctly
captures the syntax and semantics of real machines. Since it is by far the largest
piece of our safety policy we are of course concerned about its correctness. To
this end we will show how parts of it can be automatically generated from
existing systems. Here we tackle the syntax of machine instructions using machine
descriptions from the New Jersey Machine Code Toolkit [8]. We also show how to
automatically generate proofs of correspondence between machine code integers
and statements involving the decode relation.

We will describe the engineering aspects of generating small proofs of safety. 
Program safety is proved using a coinduction theorem based on progress and 
preservation of an invariant. We construct invariant expressions whose size is 
linear in the number of program instructions, and structure the progress and 
preservation proofs so that - modulo the parts that will have to be built by our 
tactical theorem prover - they are linear in size. In building these invariants we 
need to use the weakest preconditions of instructions and we will show how to 
automatically generate lemmas for a Hoare logic of machine language from the 
step relation. Our safety proofs will be linear-sized trees of applications of these 
Hoare lemmas. 

Figure 2 shows our system operating on a small program that computes the 
sum of a linked list of integers. The goal of the system is to prove that the initial 
machine configuration (IMC) is safe, in symbols the following theorem: 

    IMC(r0, m0) → safe(r0, m0)

where

    IMC(r, m)  := m(100) = 8976 ∧ ··· ∧ m(105) = 24859
    safe(r, m) := ∀r', m'. (r, m ↦* r', m') → ∃r'', m''. (r', m' ↦ r'', m'')

The IMC describes parts of memory at the moment the program will run (in this
case only the part containing the program itself). The step relation
(r, m ↦ r', m') formally describes a single instruction execution, i.e. given a
machine at state (r, m), after execution of the instruction found at r(pc), the
machine will be at state (r', m'). The safe property states that no matter how far
the execution proceeds, it never gets stuck, i.e. executes an illegal instruction
or performs an illegal fetch.

[Figure: diagram not recoverable from the source; surviving labels are
"Machine Code" and "ℒ Predicates".]

Fig. 2. Generating safety proofs.

The PCC system is presented with a list of machine code instructions (i.e.
integers). The instruction stream is fed through the decode-prover whose job
is to discover the instruction each integer represents, and to produce the
symbolic representation of each instruction, which is a predicate that describes
the instruction’s semantics. The decode-prover also produces proofs of this
correspondence. Following this, the predicates are fed into the invariant-generator
which builds the global invariant to be used in the coinduction proof.
Constructing invariants is not computable in general, so the prover requires hints
in the form of local loop invariants decorating the targets of backward branches.
Once the global invariant is built we must prove the three preconditions of the
coinduction theorem³ (see figure 3) in order to apply it. This is done by the
prover and given the three proofs we apply the rule to finally establish
safe(r0, m0).



    progress(Inv)     := ∀r, m. Inv(r, m) → ∃r', m'. (r, m ↦ r', m')
    preservation(Inv) := ∀r, m, r', m'. Inv(r, m) ∧ (r, m ↦ r', m') → Inv(r', m')

    Inv(r, m)    progress(Inv)    preservation(Inv)
    ------------------------------------------------
                      safe(r, m)



Fig. 3. The coinduction theorem. 
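On a finite transition system the content of this theorem can be checked by brute force. The Python sketch below is an illustration only: states are plain integers rather than (r, m) pairs, the step relation is a hypothetical dictionary `STEP` from states to successor lists, and `safe` is decided by explicit search, which is exactly what the coinduction theorem lets the real system avoid.

```python
def progress(inv, step):
    """Every state satisfying the invariant can take a step."""
    return all(step.get(s) for s in inv)

def preservation(inv, step):
    """Every step from an invariant state lands in an invariant state."""
    return all(t in inv for s in inv for t in step.get(s, ()))

def safe(state, step):
    """Explicit search: every state reachable from `state` has a successor."""
    seen, frontier = {state}, [state]
    while frontier:
        s = frontier.pop()
        successors = step.get(s, ())
        if not successors:
            return False          # stuck state reached
        for t in successors:
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return True

# Hypothetical four-state machine: 0 -> 1 -> 2 -> 0 loops forever, 3 is stuck.
STEP = {0: [1], 1: [2], 2: [0], 3: []}
INV = {0, 1, 2}
```

Here `progress` and `preservation` each inspect one step per invariant state, whereas `safe` explores everything reachable; the theorem says the former two suffice.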



³ A note on notation: in the interest of brevity we will sometimes use mathematical
notation when presenting Twelf terms.
    upd(f, d, x, f') :=
        ∀z. if (z = d) then f'(z) = x else f'(z) = f(z)

    i_add(d, s1, s2)(r, m, r', m') :=
        ∃sum. plus_mod16(r(s1), r(s2), sum) ∧
              upd(r, d, sum, r') ∧ no_mem_change(m, m')

    i_load(d, s1, c)(r, m, r', m') :=
        ∃cext, addr.
              sign_ext(3, c, cext) ∧
              plus_mod16(r(s1), cext, addr) ∧
              upd(r, d, m(addr), r') ∧
              readable(addr) ∧ no_mem_change(m, m')

Fig. 4. Semantics of the add and load instructions of the toy machine.



3 Machine Semantics 

In this section we show that the semantics of machine instructions can be easily 
and concisely expressed in higher order logic. We begin by explaining the idea 
using the toy machine, and then explore the problems in defining a semantic 
description of a real CPU. 

Each instruction defines a relation between the machine state (registers,
memory) before and after its execution. We treat both the memory and register
bank as functions from integers to integers. Each instruction then becomes
a predicate which takes (r, m, r', m') as input, and holds when the instruction
can safely take state (r, m) to (r', m'). In figure 4 we show the terms expressing
the semantics for the “add” and “load” instructions of the toy machine. The
Twelf term i_add (what we will call a constructor in section 4) expects three
arguments (d, s1, s2) and returns a predicate of type instr, defined as:

    instr = regs → mem → regs → mem → form.

It is this predicate that we view as the semantics of the instruction. Thus for the
add instruction, i_add(d, s1, s2) holds when for some integer sum, the following
three equations hold:

    sum = (r(s1) + r(s2)) mod 2^16

    ∀x. if (x = d) then r'(x) = sum else r'(x) = r(x)

    ∀x. m(x) = m'(x).
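The relational reading of i_add can be mimicked in Python as follows. This is a sketch under simplifying assumptions: the register bank and memory are finite dictionaries rather than total functions, and the existential over sum is discharged by computing the only possible witness. The function names follow figure 4; the Python encoding itself is not part of the authors' system.

```python
def plus_mod16(a, b, total):
    """Relation: total is a + b modulo 2^16."""
    return total == (a + b) % 2**16

def upd(f, d, x, f2):
    """Relation: f2 agrees with f everywhere except that f2(d) = x."""
    keys = set(f) | set(f2) | {d}
    return all(f2.get(z) == (x if z == d else f.get(z)) for z in keys)

def no_mem_change(m, m2):
    return m == m2

def i_add(d, s1, s2):
    """Return the predicate on (r, m, r2, m2) describing 'add d s1 s2'."""
    def holds(r, m, r2, m2):
        total = (r[s1] + r[s2]) % 2**16   # the unique candidate for `sum`
        return (plus_mod16(r[s1], r[s2], total)
                and upd(r, d, total, r2)
                and no_mem_change(m, m2))
    return holds

r = {1: 65535, 2: 2, 3: 0}
m = {100: 8976}
r_after = {1: 65535, 2: 2, 3: 1}   # 65535 + 2 wraps around to 1
```

Note that `i_add(3, 1, 2)` does not compute a new state; it only accepts or rejects a proposed pair of states, mirroring the use of predicates rather than functions in the logic.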

The situation is similar for the semantics of the “load” instruction. But we 
wish to consider a program safe only if all of its memory accesses are within a 
specified region. Therefore our step relation admits only a subset of executable 
load instructions: those that load from readable addresses. The designer of the 
safety policy must provide axioms that define the readable predicate. In general 
the semantics of each instruction must enforce the proper conditions under which 
the instruction can be executed. For “add” there are no such conditions; we can
always add two numbers. 

Real hardware can be a lot more complex than our simplistic toy machine. 
On a modern CPU one has to deal with the issues of delayed branches, address 
alignment, stores and loads of different sizes, condition registers, sign extension, 
instructions with multiple effects, and ALU operations not directly expressible 
in our arithmetic, to mention just a few. We claim that all of these can be 
handled relatively easily with the right set of abstractions and definitions. Space 
restrictions only allow us to deal with a representative subset here. We will use 
the Sparc CPU in the presentation. 



— Condition Registers: We model condition registers exactly as we model
  physical registers. We assign a number to each of them that is outside the range
  of representable register numbers and refer to them exactly the same way
  we refer to regular registers. Instructions that need to modify individual bits
  do so by the use of appropriate definitions (see the bits predicate below).

— Delayed Branches: In order to keep their deep pipelines filled, some modern
  CPUs have introduced the notion of a delayed branch. On such CPUs one
  (or more) of the instructions following a branch will be executed even if the
  branch is taken, before the CPU starts executing instructions from the target
  address. We will assume a single instruction delay slot (the solution can be
  easily generalized to a delay slot of n instructions). We introduce another
  register called the next program counter⁴ (npc) which holds the address at
  which the pc will be next. In the semantics of a branch instruction, if the
  branch is to be taken we simply set r(npc) = target and the step relation
  takes care of updating r(pc) to r(npc) at the appropriate time.

— Address Alignment: Machine addresses have to be properly aligned
  depending on the instruction that uses them. Using the bits(r, l, v, w) predicate
  (which holds when the value in the binary representation of w between bits
  r and l equals v, in symbols bits(r, l, v, w) ≡ v = ⌊w / 2^r⌋ mod 2^(l−r+1)) we
  can easily express such constraints. In a load-word instruction for the Sparc
  for instance we would insist that bits(0, 1, 0, address) holds.

— Stores/Loads: We chose to model memory by a function m : num → num that
  we define only on word-aligned addresses. This way we avoid the
  complications of modifying individual bytes in a word. When we wish to store a
  byte quantity, the entire word must be fetched from memory, the byte spliced
  into it, and then stored back in memory. For load we have a similar
  situation. With the appropriate definitions all these operations can be specified
  painlessly. With careful selection of predicates most of them can be shared
  between the load and store instructions. One such example is the predicate
  form_address below. It computes a word aligned address and offset from an
  unaligned one, and ensures that the original address was well aligned with
  respect to the size of the value we are trying to load/store.

      form_address(u_addr, alignment_bit, addr, offset, size) :=
          bits(0, alignment_bit, offset, u_addr) ∧
          minus_mod32(u_addr, offset, addr) ∧ modulo(offset, size, 0)

⁴ This is in fact how the hardware manages delayed branches. Some machines make
the npc register explicit in the specifications [10].

— Arithmetic Operations: Some of the arithmetic operations performed by
  modern CPUs are not directly expressible as functions in our logic. We
  cannot, for example, write the function that computes the bitwise “exclusive
  or” of two integers since our arithmetic primitives include only addition and
  multiplication and we have no recursion at the object level. Such operations
  are, however, trivially expressed as relations (predicates). Here is for instance
  the xor predicate:

      xor(a, b, c) = ∀i. ∃x, y, r. bits(i, i, x, a) ∧ bits(i, i, y, b) ∧
                     (if x = y then r = 0 else r = 1) ∧ bits(i, i, r, c)
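The predicates in the list above translate directly into executable checks. The Python sketch below is an illustration under stated assumptions: `minus_mod32(u_addr, offset, addr)` is read as addr = (u_addr − offset) mod 2^32 and `modulo(offset, size, 0)` as offset mod size = 0, which is our interpretation of the names rather than something the text spells out, and the universally quantified bit index in xor is cut off at a hypothetical `width` so that the check terminates.

```python
def bits(r, l, v, w):
    """Holds when bits r..l of w (inclusive, r the low end) equal v."""
    return v == (w >> r) % 2 ** (l - r + 1)

def form_address(u_addr, alignment_bit, addr, offset, size):
    """Split an unaligned address into an aligned address plus offset."""
    return (bits(0, alignment_bit, offset, u_addr)
            and addr == (u_addr - offset) % 2**32   # assumed minus_mod32
            and offset % size == 0)                 # assumed modulo(_, _, 0)

def xor(a, b, c, width=32):
    """Relation: c is the bitwise exclusive or of a and b (width-bit values)."""
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        r = 0 if x == y else 1
        if not bits(i, i, r, c):
            return False
    return True

# A load-word address must have its two low bits clear.
word_aligned = bits(0, 1, 0, 0x1000)
```

As in the logic, `xor` never computes c from a and b; it only recognizes a correct triple, one bit position at a time.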

Factoring via Higher-order Predicates. Machine instruction sets are highly
factored, both in syntax and semantics. Consider for instance the ALU operations
of any modern RISC chip. The ALU takes its input from two registers (or a
register and a constant) and produces the result in another. The only difference
between instructions is the operation performed. Our use of higher order logic
allows us to exploit such factoring very effectively. We find the commonalities in
families of instructions (even between families as in the load/store case above),
factor those out and reuse well-chosen definitions. Here is an example from the
Sparc. The definition of i_aluxcc is reused to define 23 different instructions.
Argument with_carry specifies whether the instruction operates with a “carry”,
modifies_icc specifies whether it modifies the integer condition codes, and func
is the predicate describing the operation performed by the instruction.

alu_fun = num arrow num arrow num arrow form.

i_aluxcc : tm (form arrow form arrow alu_fun arrow alu_typ) =
  lam3 [with_carry : tm form] [modifies_icc : tm form] [func : tm alu_fun]
  lam3 [rs1] [reg_imm] [rd]
  lam4 [r] [m] [r'] [m']
   (exists3 [v] [v'] [r'']
     (load_reg_imm @ r @ reg_imm @ v) and
     (compute_with_carry @ with_carry @ func @ r @ rs1 @ v @ v') and
     (compute_cc @ modifies_icc @ r @ r'' @ v') and
     (upd_reg @ r'' @ rd @ v' @ r') and
     (no_memory_change m m')).

i_AND   = i_aluxcc @ false @ false @ and_oper.

i_ANDcc = i_aluxcc @ false @ true @ and_oper.

...       21 cases omitted.

Moreover we exploit commonality between machines. Many of our definitions
that deal with the mechanics of splicing values into words, sign extension, and
arithmetic operations, are shared between semantic descriptions of different
machines. Higher-order predicates are useful in expressing this kind of sharing;
note that the i_aluxcc predicate above is higher order.

Neophytos G. Michael and Andrew W. Appel
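The same factoring pattern can be shown with ordinary higher-order functions. The Python sketch below is a deliberately simplified analogue of i_aluxcc: it is functional rather than relational, ignores carries and immediate operands, and records only a zero flag under the hypothetical register name "icc_zero". The name `alu_template` is likewise an assumption of the example.

```python
import operator

def alu_template(modifies_icc, func):
    """Shared skeleton for ALU instructions; `func` is the operation."""
    def instruction(rs1, rs2, rd):
        def execute(regs):
            result = func(regs[rs1], regs[rs2]) % 2**32
            new_regs = dict(regs)               # leave the old state intact
            if modifies_icc:
                new_regs["icc_zero"] = int(result == 0)
            new_regs[rd] = result
            return new_regs
        return execute
    return instruction

# Each instruction is one line, as with i_AND and i_ANDcc above.
i_AND = alu_template(False, operator.and_)
i_ANDcc = alu_template(True, operator.and_)
i_OR = alu_template(False, operator.or_)

regs = {1: 0b1100, 2: 0b0011, 3: 7}
after_and = i_AND(1, 2, 3)(regs)
```

Adding a further instruction means supplying one more operation, not restating how operands are fetched or how the destination register is updated.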



4 The Decode Relation 



On a von Neumann machine, each instruction is represented in memory by an
integer. The decode relation makes this notion precise. It is a predicate of four
arguments (m, w, i, s) stating that address w in memory m contains the
encoding of instruction i that has size s. Modern microprocessors have hundreds
of instructions and to construct this relation manually would be a daunting task.
The observation that the information we wish to encode is very similar to the 
information used by an assembler/disassembler led us to look for an automatic 
way to generate the relation. 

The New Jersey Machine Code Toolkit [8] helps programmers write appli- 
cations that process machine code - assemblers, disassemblers, code generators, 
and so on. The toolkit lets programmers encode and decode machine instruc- 
tions symbolically. It transforms symbolic manipulations into bit manipulations, 
guided by a specification that defines mappings between symbolic and binary rep- 
resentations of instructions. Of interest to us here is the specification language 
(called SLED) for encoding and decoding assembly-language representations of 
machine instructions [9]. It is a concise, elegant, and semantically well-founded 
language, a fact that has made the translation into logic fairly painless. In fact 
our translation into L can be viewed as a semantics for the language.

Before describing our encoding of SLED into L we offer a brief introduction
to the language. In order to accommodate machines with non-uniform instruc- 
tion sizes the toolkit works with streams of tokens instead of instructions. Each 
instruction consists of one or more tokens. Tokens are further partitioned into 
fields which are sequences of contiguous bits within a token. Patterns in SLED 
serve two purposes: firstly they are used to constrain the division of streams into 
tokens, and secondly to constrain the values of fields in those tokens. Patterns 
can be combined with various operators to produce new patterns. The toolkit is 
concerned with two representations of machine instructions: machine code and 
assembly language. Constructors are used to connect the two representations. 

Figure 5 presents a SLED specification of the toy machine architecture. The 
first two lines specify the 16-bit token instr and its fields: op which occupies 
bits 12 to 15, rd which occupies bits 8 to 11, and so on. The next line specifies a 
list of patterns (add, ..., beq) and for each one, it constrains the op field to have
the value 0, ..., 6 respectively. Finally the constructors clause specifies the
toy machine instructions. A special toolkit shortcut is used here: if no pattern is 
specified in the constructor definition then all the names used in the constructor 
must be either patterns or fields and their conjunction is taken to be the pattern 
that will be generated by the constructor. In the next subsections we show how 
to map fields, patterns, and constructors into higher-order logic.

fields of instr (16)
    op 12:15  rd 8:11  rs1 4:7  rs2 0:3  c 0:3
patterns [add addi load store jump bgt beq] is op = {0 to 6}
constructors
    add    rd, rs1, rs2
    addi   rd, rs1, c
    load   rd, rs1, c
    store  rd, rs1, c
    jump   rd, rs1, c
    bgt    rd, rs1, c
    beq    rd, rs1, c


Fig. 5. The SLED specification for the toy machine. 



4.1 Mapping Fields into L

The definition of the bits predicate (from section 3) makes it straightforward 
to map fields into L. All that it takes is to supply the right and left bit specifiers 
of each field to this predicate. Since our definitions are curried, defining fields 
in L becomes very convenient and almost as terse as it is in SLED. For the toy 
machine the first two fields are translated as follows: 

op = bits @ (const 12) @ (const 15).
rd = bits @ (const 8) @ (const 11).

The op predicate expects two integers as arguments (v, word), and it holds when
v is equal to the integer formed by bits 12 through 15 of word.
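
To make the reading of such a field predicate concrete, the following Python sketch mimics the curried bits predicate. It is only an illustration: the actual definition of bits is the one given in section 3, and the sketch assumes that bit 0 is the least significant bit and that both bit specifiers are inclusive. The sample token value is hypothetical.

```python
def bits(lo, hi):
    """Curried field predicate: bits(lo, hi)(v, word) holds when v equals the
    integer stored in bits lo..hi of word (inclusive; bit 0 assumed to be the LSB)."""
    mask = (1 << (hi - lo + 1)) - 1

    def field(v, word):
        return v == ((word >> lo) & mask)

    return field


# The first two fields of the toy machine's 16-bit token.
op = bits(12, 15)
rd = bits(8, 11)

word = 0x1234  # hypothetical token: op = 1, rd = 2
print(op(1, word), rd(2, word), op(2, word))  # True True False
```

Because bits is curried, supplying only the two bit specifiers yields a reusable predicate, which is what keeps the field definitions nearly as terse as their SLED counterparts.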



4.2 Mapping Patterns into L

Patterns in SLED constrain both the division of streams into tokens and the 
values of the fields in those tokens. They are composed of constraints on fields. 
Patterns can be combined using various operators to form other patterns. The 
RISC machine descriptions we have considered so far contain only conjunction 
and disjunction operators, and those are the ones we currently translate. We 
expect no problems in translating the rest when we choose to deal with CISC 
machines. Conjunction is used to constrain multiple fields within a single token. 
When p and q are patterns, the pattern “p & q” matches if both p and q match. 
For example, in the SLED description for Sparc [8] we find:

patterns
    [ TABLE_F2 CALL TABLE_F3 TABLE_F4 ] is op = {0 to 3}
    [ UNIMP Bicc SETHI FBfcc CBccc ] is TABLE_F2 & op2 = [0 2 4 6 7]
    NOP is SETHI & rd = 0 & imm22 = 0

[Footnote: This is another example of the terseness of SLED. In the definitions of
these patterns Ramsey [9] makes use of a SLED feature called generating
expressions, which describe ranges of lists either explicitly or implicitly as
shown in the example.]

In the first line TABLE_F2 is defined as the pattern that wants the op field to 
equal zero, in the second line TABLE_F2 is used in the definition of SETHI which 
is defined as the conjunction of patterns TABLE_F2 and op2 = 4. Finally in the 
last line pattern SETHI is used in the definition of the NOP pattern. Patterns of
this kind are very easy to translate into L. We make use of a higher-level infix
“and” operator (&&) defined as:

num_pred = num arrow form.

&& : tm num_pred -> tm num_pred -> tm num_pred =
    [p1] [p2] lam [w] (p1 @ w) and (p2 @ w).

Given && it is now easy to deal with conjunctive patterns by simply “anding” 
together the different conjuncts after mapping each of them to an L predicate.
The example above then becomes: 

p_TABLE_F2 = op @ (const 0).

p_SETHI = p_TABLE_F2 && (op2 @ (const 4)).

p_NOP = p_SETHI && (rd @ (const 0)) && (imm22 @ (const 0)).
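
The effect of the lifted conjunction can be tried out with a small Python sketch. It is an illustration under stated assumptions rather than our Twelf encoding: conj plays the role of &&, field is a hypothetical helper standing in for the field predicates, and the bit positions of op, rd, op2 and imm22 are taken from the SPARC V8 format-2 instruction layout, which is not reproduced in this paper.

```python
def field(lo, hi):
    """field(lo, hi)(v) is a predicate on words: bits lo..hi of the word equal v."""
    mask = (1 << (hi - lo + 1)) - 1
    return lambda v: (lambda word: ((word >> lo) & mask) == v)


def conj(p1, p2):
    """Lifted conjunction on word predicates (the role played by && in the text)."""
    return lambda w: p1(w) and p2(w)


# Assumed SPARC V8 format-2 layout: op 31:30, rd 29:25, op2 24:22, imm22 21:0.
op, rd, op2, imm22 = field(30, 31), field(25, 29), field(22, 24), field(0, 21)

p_TABLE_F2 = op(0)
p_SETHI = conj(p_TABLE_F2, op2(4))
p_NOP = conj(conj(p_SETHI, rd(0)), imm22(0))

print(p_NOP(0x01000000))    # True: sethi 0, %g0 is the canonical nop
print(p_SETHI(0x03000000))  # True: rd = 1, still a SETHI
print(p_NOP(0x03000000))    # False: rd is not 0
```

Each conjunct constrains one field of the same word, so the combined predicate never needs to mention the word explicitly.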

Disjunction in patterns is usually used to group patterns for related instruc- 
tions. In the following example from the Sparc SLED we use disjunction to group 
the logical, shift, and arithmetic instructions into three groups, which are then 
disjunctively combined into a pattern that matches any ALU instruction. 

patterns
    logical is AND | ANDcc | ANDN | ANDNcc | OR | ORcc | ORN | ORNcc | ...
    arith is ADD | ADDcc | ADDX | ADDXcc | TADDcc | TADDccTV | ...
    shift is SLL | SRL | SRA
    alu is logical | arith | shift

Disjunction patterns are mostly used as opcodes to constructors and we show 
how we deal with them in the next subsection. 



4.3 Mapping Constructors into L

A constructor maps a list of operands to a pattern which stands for the binary 
representation of an operand or an instruction. There are two kinds of construc- 
tors, typed and untyped. Typed constructors generate instruction operands and 
untyped constructors generate instructions. The following definition from the 
Sparc specification is an example of a typed constructor: 

constructors
    imode simm13! : reg_or_imm is i = 1 & simm13
    rmode rs2     : reg_or_imm is i = 0 & rs2

[Footnote to the NOP pattern of Section 4.2: A NOP on the Sparc is a SETHI on r0
with value 0, and since r0 is hardwired to zero it has no effect.]

Each line in the definition of a constructor specifies the opcode, the operands,
the constructor type, and matching pattern. Usually the opcode is the construc- 
tor’s name (as in this case). Constructors generate disjoint sum types. In the 
above, imode : num → reg_or_imm is the canonical injection from num into the
reg_or_imm type; likewise for rmode : num → reg_or_imm. The type is defined
implicitly at first use. Each constructor is applicable when the pattern following 
the is keyword is satisfied. 

The above constructor definition captures the following idiom: many Sparc 
instructions (such as add r1, reg_or_imm, r2) take either a register or a constant
as one of their arguments. The hardware differentiates between the two instances
by the value of bit 13 (field i) in the representation of the instruction. Depending
on the value of i, either imode or rmode can be applied, giving in each case a
reg_or_imm. 

We translate a typed constructor into L as follows. We first create a new
object-logic type for the constructor type. For each of the injective arrows (imode
and rmode above) we create an injective Twelf term (c_imode and c_rmode),
as well as a discriminator term (p_imode and p_rmode). Finally we generate a
predicate that decides the type itself (p_reg_or_imm), i.e. a term that when given
an object of that type and a word decides whether that word contains the given
object. We show these terms for the example below:

reg_or_imm : tp
c_imode : num → reg_or_imm
c_rmode : num → reg_or_imm

p_imode(simm) := i(1) && simm13(simm)
p_rmode(s2) := i(0) && rs2(s2)

p_reg_or_imm(regimm, word) :=
    (∃simm. p_imode(simm, word) ∧ regimm = c_imode(simm)) ∨
    (∃s2. p_rmode(s2, word) ∧ regimm = c_rmode(s2))
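
The three kinds of generated terms can be mirrored in a few lines of Python. This is a sketch under assumptions, not the generated Twelf: the RegOrImm class and the field helper are hypothetical, the field positions (i at bit 13, simm13 at bits 0 to 12, rs2 at bits 0 to 4) follow the SPARC V8 layout, sign extension of simm13 is ignored, and the existential quantifiers are realised by enumerating the finite range of each field.

```python
from typing import NamedTuple


class RegOrImm(NamedTuple):
    """Disjoint sum type: the tag records which injection built the value."""
    tag: str
    value: int


def c_imode(simm):
    return RegOrImm("imode", simm)


def c_rmode(s2):
    return RegOrImm("rmode", s2)


def field(word, lo, hi):
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


# Discriminators: the pattern after the `is` keyword of each constructor.
def p_imode(simm, word):
    return field(word, 13, 13) == 1 and field(word, 0, 12) == simm


def p_rmode(s2, word):
    return field(word, 13, 13) == 0 and field(word, 0, 4) == s2


def p_reg_or_imm(regimm, word):
    """Holds when word contains the operand regimm."""
    return (any(p_imode(s, word) and regimm == c_imode(s) for s in range(1 << 13))
            or any(p_rmode(s, word) and regimm == c_rmode(s) for s in range(1 << 5)))


print(p_reg_or_imm(c_imode(42), (1 << 13) | 42))  # True
print(p_reg_or_imm(c_rmode(42), (1 << 13) | 42))  # False
print(p_reg_or_imm(c_rmode(7), 7))                # True
```

The tag in RegOrImm is what makes the two injections disjoint: an immediate 42 and register 42 are different operands even though they carry the same number.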

Untyped constructors represent the instructions themselves. Their translation
into L is not much different from the typed case so we omit it.



Factoring via Higher-order Predicates. The extensive factoring present in the 
SLED specifications (through the wide use of “or” patterns) carries over to the 
translated higher-order logic terms. When translating a constructor that uses an 
“or” pattern as an opcode, we do not generate a unique term for each instruction 
but instead build just a single term that describes all of them. This way we 
preserve SLED’s economy of syntax. Here is an example for the ALU instructions 
of the Sparc shown earlier. The constructor in the spec is the following: 

constructors alu rs1, reg_or_imm, rd

p_instr(word, i) := (p_add ||_2 p_addi ||_2 p_load ||_2 p_store ||_2
                     p_jump ||_2 p_bgt ||_2 p_beq)(word, i)
decode(m, w, i, s) := (s = 1) ∧ p_instr(m(w), i)
step(r, m, r', m') := ∃ i, r'', size. decode(m, r(pc), i, size) ∧
                      upd(r, pc, r(pc) + size, r'') ∧
                      i(r'', m, r', m')



Fig. 6. The decode and step relations for the toy machine. 



and we generate the following two terms for it: 

p_alu_aux(p_i, i_cons, s1, regimm, s2, word, i) :=
    (p_i && rs1(s1) && p_reg_or_imm(regimm) && rs2(s2))(word) ∧
    i = i_cons(s1, regimm, s2)

p_alu(word, i) := ∃ s1, rimm, s2. (p_alu_aux(p_AND, i_AND) ||_5
                                   p_alu_aux(p_ANDcc, i_ANDcc) ||_5
                                   ... 35 cases omitted ...
                                   p_alu_aux(p_SRA, i_SRA))(s1, rimm, s2, word, i)

where p_AND is the opcode pattern, i_AND is the instruction constructor and 
likewise for the rest of them. Here again we make use of a higher level “or” (||_5)
operator to factor out the common arguments to the auxiliary predicate. 
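
The factoring can be illustrated with a Python sketch in which one template is shared by every ALU instruction and only the opcode pattern and the instruction constructor vary. The sketch rests on several assumptions that are not part of the paper: the field positions and the op3 values for AND, ANDcc and SRA come from the SPARC V8 manual, only three of the cases are listed, the third operand is treated as the destination register rd (following the constructor line, whereas the term above is written with rs2), and the existential quantifiers of p_alu are discharged by reading the witnesses off the word.

```python
def fld(lo, hi):
    mask = (1 << (hi - lo + 1)) - 1
    return lambda word: (word >> lo) & mask


# Assumed SPARC V8 format-3 layout.
op, rd, op3, rs1 = fld(30, 31), fld(25, 29), fld(19, 24), fld(14, 18)
i_bit, simm13, rs2 = fld(13, 13), fld(0, 12), fld(0, 4)


def operand_of(word):
    """The reg_or_imm operand contained in word (sign extension ignored)."""
    return ("imm", simm13(word)) if i_bit(word) == 1 else ("reg", rs2(word))


def opcode(op3_value):
    """Opcode pattern: op = 2 and op3 = op3_value."""
    return lambda word: op(word) == 2 and op3(word) == op3_value


def instr(name):
    """Instruction constructor: builds an abstract instruction value."""
    return lambda s1, operand, d: (name, s1, operand, d)


def p_alu_aux(p_i, i_cons):
    """The single template shared by all ALU instructions."""
    def pred(s1, operand, d, word, i):
        return (p_i(word) and rs1(word) == s1 and operand_of(word) == operand
                and rd(word) == d and i == i_cons(s1, operand, d))
    return pred


def or5(*preds):
    """Lifted disjunction on five-argument predicates."""
    return lambda *args: any(p(*args) for p in preds)


OP3 = {"AND": 0x01, "ANDcc": 0x11, "SRA": 0x27}  # three of the cases only
p_alu_cases = or5(*(p_alu_aux(opcode(code), instr(name)) for name, code in OP3.items()))


def p_alu(word, i):
    # The witnesses for s1, the operand and rd are determined by the word itself.
    return p_alu_cases(rs1(word), operand_of(word), rd(word), word, i)


word = (2 << 30) | (3 << 25) | (0x01 << 19) | (1 << 14) | 2  # and r1, r2, r3
print(p_alu(word, ("AND", 1, ("reg", 2), 3)))  # True
print(p_alu(word, ("SRA", 1, ("reg", 2), 3)))  # False
```

Adding a further ALU instruction amounts to one more entry in the table; the template itself is written once.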

Our decode-generator is a 3200-line ML program that operates directly on 
SLED specifications. Since it generates a large portion of our safety policy it 
ought to be considered trusted code (along with the SLED specifications). We 
feel that this is a small enough program that can be thoroughly and convincingly 
debugged into correctness. Furthermore its output is human readable and only a 
constant factor bigger (between 2x and 3x) than the original SLED specification. 
Thus the output can easily be inspected and debugged directly. The program 
currently does not share any code with the New Jersey Machine Code Toolkit 
although the front-end code and some of the analysis that the two programs 
perform could be shared. We plan to investigate an integration of the two tools 
in the future. 



4.4 The Decode and Step Relations 

We are finally in a position to present the decode relation for the toy ma- 
chine (see figure 6). After all the instruction predicates have been emitted, the 
decode-generator creates a predicate for the top-level token (i.e. instr in the 
case of the toy spec). This predicate is the disjunction of all the instruction
predicates (modulo factoring as described above). Decode is then defined in terms of
this predicate. Figure 6 also shows the step relation for the toy machine. It is a
predicate mapping the machine state (r, m) to (r', m') by requiring the existence
of an instruction i, a register bank r'' and an integer size such that location
r(pc) in memory m decodes to i, updating the register bank r with the next pc
produces r'', and finally instruction i safely maps (r'', m) to (r', m'). Step models
the meaning of a single instruction execution.
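
Read operationally, the two relations of figure 6 suggest the following Python sketch for the toy machine. It is a functional approximation under assumptions, not the relational definition: register banks and memories are modelled as dictionaries, only add and addi are given a meaning, and that meaning (addition modulo 2^16) is assumed here rather than taken from figure 4. A result of None stands for "the relation does not hold for any successor state".

```python
OPCODES = ["add", "addi", "load", "store", "jump", "bgt", "beq"]  # op = 0..6


def field(word, lo, hi):
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode(m, w):
    """Return (instruction, size) if address w of memory m holds an instruction."""
    word = m.get(w)
    if word is None:
        return None
    op = field(word, 12, 15)
    if op >= len(OPCODES):
        return None
    # The low four bits are rs2 for add and the constant c for the other instructions.
    instr = (OPCODES[op], field(word, 8, 11), field(word, 4, 7), field(word, 0, 3))
    return instr, 1  # every toy instruction occupies one token


def step(r, m):
    """Return the successor state (r', m'), or None if no safe step exists."""
    decoded = decode(m, r["pc"])
    if decoded is None:
        return None
    (name, d, s1, x), size = decoded
    r2 = dict(r)
    r2["pc"] = r["pc"] + size          # upd(r, pc, r(pc) + size, r'')
    if name == "add":                  # assumed semantics: modulo 2^16
        value = (r2[s1] + r2[x]) % 2**16
    elif name == "addi":
        value = (r2[s1] + x) % 2**16
    else:
        return None                    # semantics not modelled in this sketch
    r3 = dict(r2)
    r3[d] = value
    return r3, m


regs = {k: 0 for k in range(16)}
regs.update({"pc": 0, 1: 5, 2: 7})
mem = {0: 0x0312, 1: 0x1415}           # add r3, r1, r2 ; addi r4, r1, 5
regs1, _ = step(regs, mem)
regs2, _ = step(regs1, mem)
print(regs1[3], regs2[4], regs2["pc"])  # 12 10 2
```

Note that step fails (returns None) when the program counter points at a word that does not decode, which is the operational counterpart of the safety policy having no step to offer.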



5 Machine Code Proofs 

In this section we discuss some of the issues in generating the proofs used in the 
coinduction theorem (figure 3). 



5.1 Hoare-Logic Predicates for Local Invariants 

In the Floyd-Hoare logic one tries to establish statements of the form P {S} Q,
where S is a program statement, and P, Q are logical formulae. P {S} Q means
that if P holds, and S executes to completion, then Q holds. The logic specifies
a set of axioms and inference rules that allow the deduction of statements of
this form. The assignment axiom for instance states: ⊢ P[E/V] {V := E} P. In
our framework we have no such axioms or rules; nevertheless, our preservation 
statement (in figure 3) bears a striking resemblance to a Hoare judgment. What 
is stated there is in essence equivalent to: 

    Inv(r, m) {(r, m) ↦ (r', m')} Inv(r', m')     (1)

i.e. if the invariant holds at (r, m), then it must hold at the new state (r', m') at
which we were taken by the execution of some instruction (a single step). This
similarity is of course no accident; we wish to exploit the well understood theory 
of Hoare logic in order to construct the weakest preconditions that will allow us 
to prove preservation. 

Our invariant (as described in detail in previous work [5]) is in essence a
disjunction of statements of the form r(pc) = n ∧ decode(m, n, i, 1) ∧ I_n(r, m),
where i is the instruction found at m(n) and I_n is the local invariant at n.

[Footnote: The invariant presented in Appel and Felty [5] could grow exponentially
large for certain kinds of programs. By the use of appropriate higher-order
definitions we have remedied this problem and now produce invariants that are
always linear in the number of program instructions and in the size of the
compiler-inserted loop invariants (see subsection 5.2). The structure of the new
invariant is beyond the scope of this paper. The discussion in this section is
equally applicable to either kind of invariant.]

To make the situation more concrete assume that at r(pc) we find instruction
add(r1, r2, r3) (r1 := r2 + r3), and that after completion of this instruction, we
wish predicate Q(r, m) to hold at the new state. The question now is what should
I_n be in order to be able to prove equation 1, or equivalently the statement:

    r(pc) = n ∧ decode(m, n, i, 1) ∧ i = add(r1, r2, r3) ∧ I_n(r, m) ∧
    ((r, m) ↦ (r', m')) ⇒ Q(r', m').     (2)

It is not difficult to see that one such I_n is the following: Q(r, m)[(r2 + r3)/r1],
i.e. the formula we get after applying the assignment axiom of Hoare logic to
the postcondition Q(r, m). In building the invariant though, we do not wish to
perform substitution of terms for two main reasons. Firstly, if we are not careful
during substitution the local invariants could grow exponentially large. The
goal is to end up with small proofs of safety; an exponentially large theorem
is unlikely to have a small proof. Secondly, our logic does not contain axioms
that express term substitution; such axioms would render the proof checker
more complex and would defeat our efforts for a small TCB. Instead we view
substitution as a relation between terms and express the notion concisely by
higher-order definitions. These definitions allow us to express I_n(r, m) in terms
of Q(r, m) in such a way that the size of local invariants stays constant, and
substitution is completely avoided (at this stage). We define predicate let_upd
in terms of upd (introduced in figure 4) as follows:

    let_upd(r, a, v, f) := ∀r'. upd(r, a, v, r') ⇒ f(r').

Predicate let_upd specifies that for any function r' that updates r at a with value
v, f(r') must hold (we note that there is exactly one such r'; upd is deterministic).

Using this predicate we can succinctly express the weakest precondition for 
each of our instructions. Below we show the term for the add instruction; compare 
hx_add with the semantics of add shown in figure 4. 

    hx_add(d, s1, s2, post)(r, m) := ∃sum. plus_mod16(r(s1), r(s2), sum) ∧
                                     let_upd(r, d, sum, λr'. post(r', m))

The last argument to hx_add is the postcondition, and the return value is a 
predicate on (r, m) expressing the weakest precondition for the add. Our sys- 
tem currently generates all the predicate transformers (such as hx_add above) 
automatically for each instruction from the step relation of each machine. The 
program performing the translation is not part of the TCB; if there is a bug in 
it then we will simply fail to prove preservation. 
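
A predicate transformer of this shape can be sketched in Python, with predicates as Boolean functions on (r, m). The sketch is an illustration under assumptions: register banks are dictionaries, plus_mod16 is read as addition modulo 2^16, and the universal quantifier in let_upd is discharged by evaluating the continuation at the single r' that upd permits.

```python
def upd(r, a, v):
    """The unique register bank that agrees with r everywhere except at a."""
    r2 = dict(r)
    r2[a] = v
    return r2


def let_upd(r, a, v, f):
    # forall r'. upd(r, a, v, r') => f(r'); upd is deterministic, so one check suffices.
    return f(upd(r, a, v))


def hx_add(d, s1, s2, post):
    """Weakest precondition of add(d, s1, s2) with respect to post."""
    def pre(r, m):
        total = (r[s1] + r[s2]) % 2**16        # the witness for `exists sum`
        return let_upd(r, d, total, lambda r2: post(r2, m))
    return pre


def post(r, m):
    return r[1] == 12                          # hypothetical postcondition


pre = hx_add(1, 2, 3, post)                    # r1 := r2 + r3
print(pre({1: 0, 2: 5, 3: 7}, {}))             # True
print(pre({1: 12, 2: 1, 3: 1}, {}))            # False

# Transformers compose without any term substitution: r1 := r2 + r3; r1 := r1 + r1.
pre2 = hx_add(1, 2, 3, hx_add(1, 1, 1, lambda r, m: r[1] == 24))
print(pre2({1: 0, 2: 5, 3: 7}, {}))            # True
```

The postcondition is passed around as an opaque function and is never rewritten, which is the property that keeps each local invariant at constant size.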

In proving preservation we will have to prove a statement very similar to
that in equation 2 for each instruction in our program (but see section 5.2).
Such statements can be proved once and for all as lemmas and applied each
time the corresponding instruction is encountered. The extensive use of such
lemmas will have a profound effect on the size of our safety proofs. We have
currently proven such lemmas for all the instructions of the toy machine by
hand. It is our intention to generate them and their proofs automatically from
the step relation of each machine.

[Footnote on the exponential growth of local invariants under substitution
(subsection 5.1): Consider for example the program (r2 := r1 + r1; r3 := r2 + r2;
r4 := r3 + r3) with postcondition Q(r4). Its weakest precondition is
Q(((r1 + r1) + (r1 + r1)) + ((r1 + r1) + (r1 + r1))). The size of the argument
to Q grows by a factor of two for each assignment.]
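
The exponential growth under plain substitution that motivates the let_upd formulation of subsection 5.1 can be observed with a short Python sketch. The representation of terms as nested tuples and the helper names are assumptions made for the illustration; the program is the doubling chain r2 := r1 + r1; r3 := r2 + r2; and so on.

```python
def subst(expr, var, repl):
    """Replace every occurrence of variable var in expr by repl."""
    if expr == var:
        return repl
    if isinstance(expr, tuple):                # ("+", left, right)
        return (expr[0],) + tuple(subst(e, var, repl) for e in expr[1:])
    return expr


def leaves(expr):
    """Number of variable occurrences in expr."""
    if isinstance(expr, tuple):
        return sum(leaves(e) for e in expr[1:])
    return 1


def wp_argument(n):
    """Argument of Q after pushing Q(r_{n+1}) backwards through the n
    assignments r2 := r1 + r1; ...; r_{n+1} := r_n + r_n by substitution."""
    expr = f"r{n + 1}"
    for k in range(n, 0, -1):                  # last assignment first
        expr = subst(expr, f"r{k + 1}", ("+", f"r{k}", f"r{k}"))
    return expr


for n in (1, 2, 3, 10):
    print(n, leaves(wp_argument(n)))           # 2, 4, 8 and 1024 occurrences
```

With three assignments the argument of Q already contains eight occurrences of r1; each further assignment doubles the count.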

5.2 Domain Specific Proofs 

Precondition strengthening (shown below) is another rule of Hoare logic.

    P' ⇒ P    P {S} Q
    -----------------     (3)
        P' {S} Q

It states that if P {S} Q then one may replace P by a stronger predicate P'. This
scenario occurs when we deal with program loops, as we explain next. Safety 
proofs for programs with loops require the use of loop invariants. Construction 
of loop invariants is not computable in general, so our theorem prover requires 
hints in the form of typing judgments at every location that is the target of 
a backward jump. At such locations though, our invariant-generator would
have computed a local invariant I_n (this is the weakest precondition of the
instruction; see subsection 5.1). We wish to replace I_n by H_n (the typing hint
at that location) as the precondition of that instruction, but in order to be able
to do that we must establish that H_n ⇒ I_n. After that, a lemma application
similar to rule 3 allows us to conclude H_n {S} Q. We are building a tactical
theorem prover that understands the structure of types and is able to produce such
proofs. The “linear size of proofs” discussed in this paper excludes the size of the 
strengthening proofs. These are not necessarily large but a description of their 
structure is beyond the scope of this paper. 

5.3 Decode Proof-Generation 

Proofs involving the decode relation can be hard to generate since the def- 
inition itself is quite involved. Our decode-prover (see figure 2) is a Twelf 
logic program that analyzes the machine-code stream and not only discovers 
which instruction each integer represents but also produces a proof of this fact. 
More concretely, if integer n represents instruction i, we get a proof of the
statement instruction(n, i) from which a proof of decode(m, w, i, s) follows
trivially (given a proof that n = m(w)). The decode-prover for the toy machine
is about 600 lines of Twelf, currently hand written. We plan to generate the 
decode-prover itself from the SLED specification of each machine. Note that 
the decode-prover is not part of the TCB; any bug in it will simply produce 
an invariant from which it will be impossible to show preservation. 



6 Related Work 

There has been a large amount of work in the area of proofs of machine language 
programs using both first order [14] and higher order logics [15] [16]. Some of this 
work was focused on proving the correctness of the compiler or the code generator
(see for instance [13]). For a historical survey see Calvert [18]. The practice of
proving the Hoare rules as lemmas (see subsections 5.1 and 5.2) in an underlying
logic is widespread among the program-verification community [15] [16] [17].

Two pieces of work are most related to ours: Wahab [15] is concerned with 
correctness (not just safety) of assembly language programs. He defines a flow- 
graph language expressive enough to describe sequential machine code programs 
(he deals with the Alpha AXP processor). Substitution is a primitive operator 
and the logic contains rules detailing term equality under substitution. He proves 
the Hoare-logic rules as theorems and uses abstraction in order to massage the 
code stream and get shorter correctness proofs. The translation from machine 
code to the flow-graph language does not go through a “decode” relation. Also 
the use of substitution as a primitive makes this approach unsuitable for our 
purposes since it complicates the TCB. 

Boyer and Yu [14] formally specify a subset of the MC68020 microprocessor 
within the logic of the Boyer-Moore Theorem Prover [19], a quantifier-free first 
order logic with equality. Their specification of the step relation is similar to ours 
(they also include a decode relation) but in their approach these relations are 
functions. The theorem prover they use allows them to “run” the step function 
on concrete data (i.e. once the step function is specified they automatically have 
a simulator for the CPU). Their logic, albeit first-order, appears to be larger 
than ours mainly because of its wealth of arithmetic operators (decoding can be 
done directly from the specification). Also their machine descriptions are larger 
than ours; the subset of the 68020 machine description is about 128K bytes while 
our description of the Sparc is less than half that size. Admittedly, the Motorola 
chip is much more complex than the Sparc, but we suspect that most of the 
size difference is attributed to our extensive use of factoring facilitated by higher 
order logic. 



7 Conclusion and Future Work 

We have shown how higher-order logic can be used to succinctly describe the 
syntax and semantics of machine instructions, in a manner that preserves the 
natural factoring of each architecture. Our step relation formally captures the 
notion of a single instruction execution. It consists mainly of two pieces: (1) the 
decode relation that specifies the syntax of machine instructions, and (2) ax- 
ioms describing the semantics of each instruction by predicates mapping machine 
states to machine states. The decode relation is generated automatically from 
existing compiler tools. Large parts of the safety proof involving decode can 
be generated completely automatically. We explained how to build Hoare-logic 
predicate transformers from our step relation in order to simplify the construc- 
tion of the global invariant, and how lemmas can be used to minimize the size 
of safety proofs involving this invariant. The system is implemented in Twelf [4] 
and all theorems have been mechanically checked.

We are building a PCC system that will be used to generate safety proofs for
many different architectures. Building all the pieces of figure 2 for each machine 
would be a daunting and unrewarding task. We instead intend to generate most 
of the prover components shown in figure 2 completely automatically. Since the 
decode-prover is in essence a machine-code disassembler, we intend to gener- 
ate it directly from the decode relation of each machine or alternatively from 
each machine’s SLED specification. Note that the decode-prover not only dis- 
assembles but also builds proofs involving decode. The invariant-generator 
is again machine-instruction dependent and can also be generated directly from 
decode (we already generate the predicate transformers expressing the weakest 
precondition for each instruction automatically from step). It is our intention 
to automatically generate the Hoare-logic lemmas (of subsection 5.1) along with 
their proofs from step since there will be a large number of them and their 
proofs tend to be rather long. The proof of preservation (see figure 3) requires 
an inversion lemma for decode. We have not proved this lemma for any machine 
yet, but we expect the proof to be mundane and long (linear in the size of the 
instruction set). Our plan is to generate these proofs from decode. Finally we are 
working on a tactical theorem prover that will fill in parts of the proofs involving 
compiler-inserted invariants at locations of backward branches (see subsection
5.2).



Abstract. The ability of a theorem prover to generate explicit derivations
for the theorems it proves has major benefits for the testing and
maintenance of the prover. It also eliminates the need to trust the cor- 
rectness of the prover at the expense of trusting a much simpler proof 
checker. However, it is not always obvious how to generate explicit proofs 
in a theorem prover that uses decision procedures whose operation does 
not directly model the axiomatization of the underlying theories. In this 
paper we describe the modifications that are necessary to support proof 
generation in a congruence-closure decision procedure for equality and in 
a Simplex-based decision procedure for linear arithmetic. Both of these 
decision procedures have been integrated using a modified Nelson-Oppen 
cooperation mechanism in the Touchstone theorem prover, which we use 
to produce proof-carrying code. Our experience with designing and im- 
plementing Touchstone is that proof generation has a relatively low cost 
in terms of design complexity and proving time and we conclude that the 
software-engineering benefits of proof generation clearly outweigh these
costs. 



1 Introduction 

There are several reasons why a theorem prover ought to produce easily check- 
able derivations of the formulas it proves. First, that way the soundness of the 
theorem prover does not have to be trusted since it is reduced to the soundness 
of a much simpler proof checker. This allows theorem proving tasks to be dele- 
gated to anonymous or even untrusted parties, such as remote proving servers, 
without loss of confidence in the result. On the software-engineering side, the 
testing and maintenance of a proof-generating theorem prover can be simplified 
considerably at the cost of implementing a simple proof checker. Our initial mo- 
tivation for developing such a theorem prover was to assist with the generation 
of proof-carrying code [Nec97], in which an explicit proof of safety is attached to
mobile code to allow a code receiver to verify easily the compliance of the code 
with a safety policy.

The complexity of proof generation in a theorem prover depends on the prover
design. For example, a simple theorem prover can be written as an interpreter 
for a logic program consisting of a transcription of axioms and inference rules. In 
fact, the first implementation of proof-carrying code used the Elf [Pfe94] system 
to search for proofs when the logic was expressed as an LF signature, in the style 
described in [Pfe91]. For such a theorem prover it is a simple bookkeeping task 
to record the proof as the sequence of the inference rules used on the successful 
search path. 

The problem is complicated somewhat in theorem provers based on decision 
procedures, such as PVS [ORS92] or Simplify [DLNS98], because of the indi- 
rect relationship between the decision algorithm, sometimes described in terms 
of graphs [Sho81] or matrices [Nel81], and the axiomatization of the theories 
involved. In this paper we describe an extension for proof generation of the 
Nelson-Oppen cooperating decision-procedures model and then we show how 
to implement proof generation in a congruence closure decision procedure for 
equality and in a Simplex-based decision procedure for linear arithmetic. We 
implemented these decision procedures along with a few others in the Touch- 
stone theorem prover that we use in our proof-carrying code experiments. One 
noteworthy feature of our implementation is that proofs of intermediate subgoals 
are generated lazily and only if they turn out to be on the successful proof search 
path. With this optimization the overhead of proof generation is a 30% increase 
in the size of the prover source code and a 15% increase of proving time. 

Proof generation or logging appears in various forms in other theorem 
provers as well. In LCF-style tactic-based provers (e.g. Isabelle [Pau94] and 
HOL [Gor85]) the lack of decision procedures allows a simple implementation of 
proof logging in the form of a trace of the successful proof search path. In theorem 
provers that do use decision procedures (e.g., PC-Nqthm [BM79], PVS [ORS92], 
Simplify [DLNS98]) most often the prover records only the user input and the 
invocations of the decision procedures, to allow batch-mode proof playback. This 
means that the implementations of the decision procedures must be trusted since 
they are also part of the proof checker. In addition to Touchstone, a select num- 
ber of other theorem provers combine decision procedures for efficiency and proof 
generation for assurance. One of them is the Stanford Validity Checker [SD99], 
which uses a different set of decision procedures and the Shostak method for 
integrating decision procedures instead of the Nelson-Oppen method discussed 
here. 

A more closely related result is Boulton’s integration [Bou93,Bou95] of a 
fully-expansive implementation of the Nelson-Oppen method in the HOL the- 
orem prover [Gor85]. While we used some of the same techniques as Boulton 
(such as the lazy generation of proof objects [Bou92]), our work is different from 
Boulton’s in two respects. Boulton chooses to use a version of Fourier-Motzkin 
elimination for deciding linear arithmetic formulas, in order to simplify the task 
of generating proofs ([Bou93], page 80). We have opted for a more complex but 
apparently more efficient decision procedure based on the Simplex algorithm. A 
second difference is that Boulton uses a functional programming style in order to 
implement easily the required undo feature of decision procedures. We decided to
use an imperative style and program the undo feature explicitly in order to have 
better control on the memory usage. As a result, we were able to implement the 
undo feature for the Simplex algorithm with a very small cost by not actually 
reverting the data structures to their original form but to another one that is 
equivalent. This, coupled with the modifications that are required to the linear-
programming version of Simplex to make it usable in a Nelson-Oppen prover,
complicates the proof generation problem for Simplex. We show a solution to
this problem in Section 3.2. 

In addition to presenting the particular techniques that we use to generate 
proofs from decision procedures, a substantial part of this paper summarizes our 
experience in building and using the Touchstone theorem prover for producing 
proof-carrying code. We discuss both the additional programming-complexity
cost of proof generation and the benefits of proof generation for debugging
and maintaining the prover. In fact, we show that the ability to generate
proofs and thus to check easily each run of the prover allowed us to use aggres- 
sive implementation techniques in order to gain efficiency. Our measurements 
show that Touchstone appears to be faster than Boulton’s implementation of 
the Nelson-Oppen strategy in the HOL theorem prover. In order to achieve this 
we had to adopt a more aggressive imperative programming style which led to a 
number of subtle design and programming errors that could have been avoided 
in a purely functional implementation. However this did not turn out to be a reli- 
ability problem because the proof checker quickly pointed out our programming 
errors. 

Considering the complexity of implementing proof generation and the run- 
time cost of synthesizing proofs on one hand and, on the other hand, the number 
of design and implementation errors that were uncovered by proof checking dur- 
ing testing and maintenance along with the added value of the theorem prover 
as a proof-carrying code generator, we strongly advocate that theorem provers 
ought to generate easily checkable proofs. 

2 Overview of the Touchstone Theorem Prover 

Touchstone has a modular design based on a strategy for combining decision 
procedures first described by Nelson and Oppen [NO79]. The innovation in
Touchstone lies in a modification of the Nelson-Oppen strategy to allow for 
proof-generating decision procedures and also in the techniques used to generate 
proofs in individual decision procedures. In this paper we discuss such techniques 
for the congruence closure decision procedure for equality and a Simplex-based 
decision procedure for linear arithmetic. 

Touchstone handles the fragment of first-order logic shown in Figure 1, where
the languages of literals L and expressions E can be extended with additional
operators or function symbols. There are two motivations for restricting ourselves
to such a small subset of first-order logic formulas. First, this fragment is
sufficient for expressing verification conditions for programs whose loop invariants
and function pre/postconditions are themselves restricted to the language
H of hypotheses. This is true in all applications to date of proof-carrying code
where, in fact, we currently use only conjunctions of literals as hypotheses.
Second, this fragment of intuitionistic logic has the convenient property that all
inference rules are invertible and thus we can use a very simple yet complete
inversion proof-search procedure without any disjunctive or existential choices.
In essence, the hard part of the proving task in this fragment of logic lies with
the decision procedures that handle goal literals. The prover can be extended to
handle more logical connectives all the way to higher-order hereditary Harrop
formulas following, for example, the strategies described in [Mil91] or [MNPS91].

Goals         $G ::= L \mid \top \mid G_1 \wedge G_2 \mid H \supset G \mid \forall x.\,G$

Hypotheses    $H ::= L \mid \top \mid H_1 \wedge H_2 \mid H_1 \vee H_2 \mid \exists x.\,H$

Literals      $L ::= E_1 = E_2 \mid E_1 \neq E_2 \mid p(E_1, \ldots, E_n)$

Expressions   $E ::= n \mid E_1 + E_2 \mid f(E_1, \ldots, E_n) \mid \cdots$

Fig. 1. The syntax of formulas handled by Touchstone.
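
The inversion procedure for this fragment can be illustrated with a short Python sketch. It is not Touchstone's code (the prover is written in Standard ML); the class names and the functions `split_hyp` and `invert` are hypothetical, and the treatment of bound variables is deliberately naive (no renaming). The sketch only shows that every connective is handled by a deterministic decomposition, with a hypothesis disjunction producing a case split rather than a choice.

```python
from dataclasses import dataclass

# Hypothetical formula representation for the fragment of Figure 1.
@dataclass(frozen=True)
class Lit:
    text: str

@dataclass(frozen=True)
class Top:
    pass

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Or:            # allowed in hypotheses only
    left: object
    right: object

@dataclass(frozen=True)
class Exists:        # allowed in hypotheses only
    var: str
    body: object

@dataclass(frozen=True)
class Implies:       # goal of the form H ⊃ G
    hyp: object
    goal: object

@dataclass(frozen=True)
class Forall:        # allowed in goals only
    var: str
    body: object

def split_hyp(h):
    """Eliminate a hypothesis into a list of cases, each a list of literals."""
    if isinstance(h, Lit):
        return [[h.text]]
    if isinstance(h, Top):
        return [[]]
    if isinstance(h, And):
        return [l + r for l in split_hyp(h.left) for r in split_hyp(h.right)]
    if isinstance(h, Or):        # case split: both cases must be handled
        return split_hyp(h.left) + split_hyp(h.right)
    if isinstance(h, Exists):    # bound variable treated as a fresh constant
        return split_hyp(h.body)
    raise TypeError(h)

def invert(goal, hyps=()):
    """Return the (hypothesis literals, goal literal) tasks left for the decision procedures."""
    if isinstance(goal, Lit):
        return [(hyps, goal.text)]
    if isinstance(goal, Top):
        return []
    if isinstance(goal, And):
        return invert(goal.left, hyps) + invert(goal.right, hyps)
    if isinstance(goal, Implies):
        return [task for case in split_hyp(goal.hyp)
                for task in invert(goal.goal, hyps + tuple(case))]
    if isinstance(goal, Forall):
        return invert(goal.body, hyps)
    raise TypeError(goal)

goal = Forall("x", Implies(And(Lit("x >= 0"), Or(Lit("y = 0"), Lit("y = 1"))),
                           And(Lit("x + y >= 0"), Top())))
tasks = invert(goal)
```

Each resulting task corresponds to asserting the hypothesis literals together with the negation of the goal literal and waiting for a contradiction.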

A decision procedure for a given theory T in Touchstone knows how to decide 
whether a set of literals entails another literal, in the case when all of the lit- 
erals involved contain only function symbols from T. Most decision procedures 
in Touchstone are implemented in terms of satisfiability procedures that can 
detect when a set of literals is unsatisfiable. In practice, goal formulas contain 
literals from multiple theories and, although necessary, it is not sufficient to have 
decision procedures for these isolated theories. Furthermore, combining decision 
procedures is not as straightforward as it might seem. To illustrate this point,
consider the theory Q of rational numbers with the free symbols $+$, $-$, $\geq$ and the
numerals along with the usual axioms of rational arithmetic. Consider also the
theory E with one uninterpreted unary function symbol “f”. The satisfiability
problems for each of these theories considered separately were solved long ago
by Fourier for Q and by Ackermann for E [Ack54]. Consider now the following
goal from the combined theory Q + E (this example is taken from [Nel81]):

$$f(f(x) - f(y)) \neq f(z) \;\wedge\; y \geq x \;\wedge\; x \geq y + z \;\wedge\; z \geq 0 \tag{1}$$

Informally, to demonstrate that the above set of literals is not satisfiable, we
would first use the two literals in the middle to infer in Q that “$0 \geq z$” and
then the last literal to demonstrate that “$z = 0$” and hence also that “$x = y$”.
Then, we use the congruence rule of E to infer that “$f(x) = f(y)$”. Then we
move again in Q to prove that “$f(x) - f(y) = z$” and then back to E to prove
that “$f(f(x) - f(y)) = f(z)$”. This allows E to detect the contradiction with
the first literal and to declare that the set of literals is not satisfiable. This
example demonstrates that, in general, the decision procedures must interact in
a non-obvious way to detect unsatisfiability.

Fig. 2. The overall structure of the Touchstone theorem prover.



Nelson and Oppen show that it is enough for each decision procedure to 
broadcast only the contradictions and the equalities between variables that it 
discovers. This simple cooperation mechanism is shown in [Nel81] to be com- 
plete when the theories involved are convex. A theory is not convex if it has a 
set of literals that entails a proper disjunction of equalities between variables 
without entailing any single equality. For example, the theory Z of integer linear 
arithmetic is not convex since $y = z + 1 \wedge y \geq x \wedge x \geq z$ entails $x = y \vee x = z$.
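
The non-convexity example can be checked on a small range of integers with the Python sketch below. This is an illustration only: a finite search is not a proof, and the range bounds and variable names are arbitrary choices made for the example.

```python
from itertools import product

def premises(x, y, z):
    # y = z + 1, y >= x, x >= z over the integers
    return y == z + 1 and y >= x and x >= z

# All integer triples in a small box that satisfy the premises.
models = [(x, y, z) for x, y, z in product(range(-3, 4), repeat=3)
          if premises(x, y, z)]

# The disjunction holds in every model ...
disjunction_holds = all(x == y or x == z for x, y, z in models)
# ... but neither equality holds in every model.
x_eq_y_entailed = all(x == y for x, y, z in models)
x_eq_z_entailed = all(x == z for x, y, z in models)
```

In the searched range the disjunction is true in every model while each single equality fails in some model, which is the situation the definition of non-convexity describes.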

The Nelson-Oppen architecture can be adapted to deal with non-convex the- 
ories by performing a case-split whenever a conjunction of literals entails a dis- 
junction. Informally, the prover tries to guess which one of the disjuncts holds 
and asserts it to all decision procedures. If this does not lead to unsatisfiabil- 
ity then the next disjunct is tried. For this procedure to be correct there are 
additional technical requirements that the theories must satisfy as explained 
in [Nel81]. 

The structure and operation of Touchstone are shown in Figure 2. Input goals
are first broken into literal goals and assertions by the “Inversion” module. Hy- 
pothesis literals and negated goal literals are asserted along with their proofs to 
the “Dispatch” module that essentially implements a broadcast medium between 
decision procedures. Each decision procedure receives either proved asserted lit- 
erals or proved equality of variables discovered by other decision procedures. 
Decision procedures can also discover a contradiction, which is then propagated 
to the “Inversion” module. The “Subgoal” module is discussed later. As an
optimization, equalities discovered and broadcast by decision procedures are not
accompanied by an actual proof but only by a token that identifies the originator
decision procedure. Proofs are produced on demand only if the equality is
actually used in generating a contradiction. This optimization is similar to the
one described by Boulton [Bou92].

The “Inversion” module is fairly simple mostly due to the limited fragment 
of first-order logic that it has to handle. Informally, goals are first broken using 
the appropriate introduction rules. When the goal is a literal its negation is 
asserted to the “Dispatch” module. Other sources of assertions are the left-hand
sides of implications, which are broken using elimination rules into a sequence of
literals to be asserted. As discussed before, the “Dispatch” module broadcasts
the assertions received from the “Inversion” module to all decision procedures 
and expects them to return either a contradiction or a set of entailed equalities 
between variables. This proof procedure is complete for the fragment of logic 
that we consider here. The completeness at the level of literals is ensured by the 
correctness theorem for the Nelson-Oppen strategy for convex theories. 

In order to generate proofs several changes have to be made: 

— The “Inversion” module must keep track of the introduction rules that it 
uses while breaking up the goal and the elimination rules that it uses while 
breaking up the assertion. 

— The “Dispatch” module and consequently all decision procedures must re- 
ceive the asserted literals accompanied by a proof. 

— A decision procedure that discovers an equality must also be able to produce 
a proof of that equality, possibly in terms of the proofs accompanying the 
assertions that it received previously. 

— Furthermore, a decision procedure that discovers a contradiction must be 
able to exhibit a proof of falsehood. 

All proof objects maintained in the system are represented as terms in the 
Edinburgh Logic Framework (LF) [HHP93], which conveniently allows higher- 
order proof representations, so we can use the simple LF type checker as a proof 
checker. 

We decided to use an imperative implementation of the system so that each 
decision procedure can maintain state without having to pass it around, as
would be necessary in a purely functional implementation. To ensure proper
scoping of assertions and thus to maintain the soundness of the prover, the in- 
version module announces when decision procedures must “forget” certain asser- 
tions they have received. This is implemented by programming the “Inversion” 
module to issue pairs of commands snapshot and undo. These commands are
broadcast to decision procedures by the “Dispatch” module. The intended se- 
mantics of the undo operation is that all decision procedures should adjust their 
internal data structures so that they do not reflect assertions that were made af- 
ter the matching snapshot operation. This ensures that assertions are properly 
retracted from the system at the time of the undo. 

2.1 The Subgoal Module 

Touchstone deviates from the Nelson-Oppen architecture described in [Nel81] by
allowing the use of decision procedures based on tactics. A tactic-based decision
procedure can, in addition to detecting contradictions and equalities between
variables, announce subgoals that, if proved, would allow the decision procedure 
to detect a contradiction. Each such subgoal is announced to the “Subgoal” 
module along with a function that given a proof of the subgoal generates a proof 
of falsehood. We refer to this function as the proof transformer. The “Subgoal” 
module is used extensively in Touchstone for the implementation of various de- 
cision procedures for type checking, as explained in Section 4. 

Touchstone attempts to prove tactic-generated subgoals only when the cur- 
rent proof is about to fail. If the “Inversion” module notices that no contradic- 
tion is announced after it asserts the negation of a goal literal then it queries 
the “Subgoal” module for a list of subgoals that were announced by tactic-based 
decision procedures. The “Inversion” module considers these subgoals in turn and if any
one of them is proved it can use the associated proof transformer to generate 
the desired contradiction. 

An added benefit of the “Subgoal” module is that non-convex decision proce- 
dures can be incorporated quite naturally. If a decision procedure cannot discover 
an equality but it does discover a proper disjunction of equalities it is in a po- 
sition to announce a subgoal consisting of a proper conjunction of disequalities. 
This will lead the “Inversion” module to perform a case-split and to try to prove all
disequalities independently, which is exactly the behavior desired in the presence
of non-convex theories. 

This completes the description of the modules responsible for the control of
the decision procedures. In the next section we describe some general principles 
for implementing proof-generating decision procedures and then we examine in 
more detail two of the decision procedures of Touchstone. 

3 Proof-Generating Decision Procedures 

All decision procedures in Touchstone use the same internal representation of 
literals in the form of a global expression directed acyclic graph, or the E-DAG. 
The E-DAG contains a node for each unique subexpression of the goal formula. 
In addition to the global E-DAG, each decision procedure is free to maintain its 
own internal state. 

Each decision procedure is required to implement at least three functions: 

— The assert function that given a literal and its proof asserts the literal to 
the decision procedure. This function should return a list of equalities that 
were discovered along with their proofs, or a contradiction along with a proof 
of falsehood. As a side-effect this function can also announce subgoals along 
with their proof transformers to the “Subgoal” module. 

— The snapshot and undo functions, as described above. 

There are at least a couple of ways to implement the snapshot and undo
operations. The simplest way is for each decision procedure to maintain a stack
of the input assertions. Each time a new assertion arrives it is considered with
respect to the assertions already memorized on the stack for the purpose of
detecting equalities or announcing a contradiction. Each snapshot places a marker
on the stack and each undo pops the stack up to the nearest marker. This simple
strategy is typical of the decision procedures that use backward chaining. In
Touchstone, the modular arithmetic and the typing decision procedures use this
simple strategy.

$$\frac{}{E = E}\ \textit{eqid} \qquad \frac{E_2 = E_1}{E_1 = E_2}\ \textit{eqsym} \qquad \frac{E_1 = E_2 \quad E_2 = E_3}{E_1 = E_3}\ \textit{eqtr}$$

$$\frac{E_1 = E_2 \quad E_1 \neq E_2}{\bot}\ \textit{falsei} \qquad \frac{E_1 = E_1' \quad \cdots \quad E_n = E_n'}{f(E_1, \ldots, E_n) = f(E_1', \ldots, E_n')}\ \textit{congr}$$

Fig. 3. The axioms of the theory E of equality.

Another strategy is used by the forward-chaining decision procedures. These 
typically maintain internal data structures that reflect the current assertions 
in an internal form. As new assertions arrive they are internalized in the data 
structure and equalities are propagated. To be usable in Touchstone, such deci- 
sion procedures must be able to revert their data structures to the state at the 
matching snapshot. Decision procedures can implement the undo operation by 
using non-mutable data structures or by maintaining a list of the destructive 
operations performed on the state so that they can be undone. The latter strat- 
egy is used in Touchstone by the congruence closure and the Simplex decision 
procedures. 
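
The second strategy (recording destructive operations so that they can be undone) can be sketched in Python as follows. The class `UndoableMap` and its method names are hypothetical and stand in for whatever mutable state a forward-chaining decision procedure keeps; this is a minimal illustration of the snapshot/undo discipline, not Touchstone's implementation.

```python
class UndoableMap:
    """A dictionary whose destructive updates can be rolled back to a snapshot."""

    _MISSING = object()   # sentinel: the key had no previous value

    def __init__(self):
        self._data = {}
        self._trail = []  # (key, previous value) entries and None markers

    def set(self, key, value):
        # Record the old value before overwriting it.
        self._trail.append((key, self._data.get(key, self._MISSING)))
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def snapshot(self):
        # A marker separates the updates made before and after this point.
        self._trail.append(None)

    def undo(self):
        # Revert updates in reverse order up to the nearest marker.
        while self._trail:
            entry = self._trail.pop()
            if entry is None:
                return
            key, old = entry
            if old is self._MISSING:
                del self._data[key]
            else:
                self._data[key] = old

state = UndoableMap()
state.set("root(a)", "a")
state.snapshot()
state.set("root(a)", "b")
state.set("root(c)", "b")
state.undo()   # state again reflects only the assertions made before the snapshot
```

Nested snapshot/undo pairs work because each undo stops at the nearest marker, which matches the scoping of assertions imposed by the “Inversion” module.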

3.1 Proof Generation in the Congruence Closure Algorithm 

A central theory in any implementation of the Nelson-Oppen architecture is the 
theory of equality. The free functions of the theory E are “=” and “≠” along
with any uninterpreted function symbols. The axioms of the theory are those 
shown in Figure 3. There is one congruence rule for each function symbol in the 
system. 

The theory E was first shown decidable by Ackermann [Ack54] by reducing
the problem to that of constructing the congruence closure of a relation on
a graph. If $\mathcal{R}$ is an equivalence relation over a set of terms, we say that two
terms $f(t_1, \ldots, t_n)$ and $f(t'_1, \ldots, t'_n)$ are congruent if $t_i$ is related to $t'_i$ by $\mathcal{R}$
for all $i = 1, \ldots, n$. The congruence closure of a relation $\mathcal{R}$ is the smallest
extension of $\mathcal{R}$ that is both an equivalence relation and relates all its congruent
terms. To see if a given equality “$t = u$” follows from a set of equalities $\mathcal{R}$,
we first construct the congruence closure $\mathcal{R}'$ of $\mathcal{R}$ and then check to see if
$(t = u) \in \mathcal{R}'$. This is the sense in which an algorithm for computing the congruence
closure of a set of equalities can be at the base of a decision procedure for E.
The implementation of congruence closure in Touchstone is a proof-generating
extension of that described in [Nel81].

In addition to the E-DAG, the congruence closure algorithm uses its own internal
data structures to represent the congruence closure of the current equality
assertions. Thus, a mapping root is maintained to map each subexpression to a
representative for its equivalence class. A mapping forbid maps a class
representative C to a set of nodes that are known to be distinct from C. Finally, the
incoming equality and disequality assertions are stored on an undoStack
along with their proofs. Additionally, whenever a congruence is discovered, the 
corresponding equality along with its proof is pushed on the undoStack. 

For the purposes of discussing the proof generation strategy we do not need 
to see the whole implementation but just the invariant that it maintains. This 
invariant is shown below, with the notation class(a) denoting the set of nodes
with the same root as a:

C1. root(a) = root(b) if and only if a = b or there exist a_1, ..., a_{n+1} such that
a = a_1, b = a_{n+1}, and (a_i = a_{i+1}, pf_i) ∈ undoStack or
(a_{i+1} = a_i, pf_i) ∈ undoStack for all i = 1, ..., n.

C2. class(a) ∩ forbid(root(b)) ≠ ∅ if and only if there exist a' ∈ class(a) and
b' ∈ class(b) such that (a' ≠ b', neqab) ∈ undoStack.

Based on this invariant we can define a function prfEq(a, b) that given two
expressions with the same root produces a proof of their equality, and also a
function mkEqContra(a, b, eqab) that given two expressions as in Invariant C2
along with a proof of their equality, produces a proof of falsehood, as shown
below.

prfEq(a : node, b : node) =      /* root(a) = root(b) */
    if a = b then return eqid(a)
    let a_1, ..., a_{n+1} as in Invariant C1
    let pf'_i = pf_i          if (a_i = a_{i+1}, pf_i) ∈ undoStack
                eqsym(pf_i)   if (a_{i+1} = a_i, pf_i) ∈ undoStack
    return eqtr(pf'_1, eqtr(pf'_2, ..., eqtr(pf'_{n-1}, pf'_n) ...))

mkEqContra(a, b, eqab) =         /* class(a) ∩ forbid(root(b)) ≠ ∅ */
    let a' ∈ class(a), b' ∈ class(b) such that (a' ≠ b', neqab) ∈ undoStack
    return falsei(eqtr(prfEq(a', a), eqtr(eqab, prfEq(b, b'))), neqab)

The decision procedure operates as follows. When an equality a = b is asserted
we first check the Invariant C2 condition to see if we detected a contradiction,
in which case we use mkEqContra to generate the proof of falsehood required
for the Contradiction exception. Otherwise we push the asserted equality along 
with its proof on the undoStack and we merge the classes of a and b updating the 
forbid sets accordingly to preserve the invariant. Finally, we check for newly in- 
troduced congruences. Any such congruence is both an equality to be announced 
to other decision procedures and an input to a recursive invocation of the merge 
procedure. For this latter step we use the congr rule to generate the appropriate 
proof. 

When a disequality assertion a ≠ b along with its proof neqab is encountered
we first check whether a and b are equivalent. If they are then we announce
a Contradiction with the proof falsei(prfEq(a, b), neqab). Otherwise we just
add (a ≠ b, neqab) to the undoStack and we update the forbid sets of both
a and b.

Note that congruence closure is a convex decision procedure and thus, it does 
not need to create case splits. Also, as proven in [NO80] and [Nel81] for similar 
implementations, the congruence closure algorithm is a sound and complete deci- 
sion procedure for E. The algorithmic complexity of the algorithm is determined 
by the method used to discover the congruent pairs of nodes. In Touchstone this 
is done with a simple algorithm of complexity $O(n^2)$, where $n$ is the number
of nodes in the E-DAG. The complexity can be reduced to $O(n \log n)$ using the
more complex strategy described in [DST80]. The proof generation extensions 
do not affect the algorithmic complexity. For more implementation details the 
reader is invited to consult [Nel81,Nec98]. 
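
The way proofs are read off the recorded assertions can be illustrated with the following Python sketch. It is a simplified stand-in, not Touchstone's implementation: terms are strings or tuples `(f, arg, ...)`, proofs are nested tuples named after the rules eqid, eqsym, eqtr, congr and falsei, the chain required by Invariant C1 is found by a breadth-first search, congruences are found by a naive pairwise scan, and contradictions are checked after merging rather than before. The undo machinery and the forbid sets are omitted. All names are hypothetical.

```python
from collections import deque

class Contradiction(Exception):
    def __init__(self, proof):
        super().__init__(proof)
        self.proof = proof

class CongruenceClosure:
    def __init__(self, terms):
        self.terms = list(terms)
        self.root = {t: t for t in self.terms}
        self.eqs = []    # (a, b, proof): asserted or discovered equalities
        self.neqs = []   # (a, b, proof): asserted disequalities

    def prf_eq(self, a, b):
        """Proof of a = b, assuming root[a] == root[b] (Invariant C1)."""
        if a == b:
            return ("eqid", a)
        parent = {a: None}
        queue = deque([a])
        while queue:                      # search for a chain a = a_1, ..., a_{n+1} = b
            u = queue.popleft()
            if u == b:
                break
            for s, t, pf in self.eqs:     # each equality is usable in both directions
                for src, dst, step in ((s, t, pf), (t, s, ("eqsym", pf))):
                    if src == u and dst not in parent:
                        parent[dst] = (u, step)
                        queue.append(dst)
        steps, node = [], b
        while parent[node] is not None:
            node, step = parent[node]
            steps.append(step)
        steps.reverse()
        proof = steps[-1]
        for step in reversed(steps[:-1]): # eqtr(pf_1, eqtr(pf_2, ...))
            proof = ("eqtr", step, proof)
        return proof

    def assert_eq(self, a, b, proof):
        if self.root[a] == self.root[b]:
            return
        self.eqs.append((a, b, proof))
        old, new = self.root[b], self.root[a]
        for t in self.terms:              # merge the two classes
            if self.root[t] == old:
                self.root[t] = new
        for s, t, npf in self.neqs:       # a disequality now inside one class?
            if self.root[s] == self.root[t]:
                raise Contradiction(("falsei", self.prf_eq(s, t), npf))
        apps = [t for t in self.terms if isinstance(t, tuple)]
        for u in apps:                    # naive search for new congruences
            for v in apps:
                if (u[0] == v[0] and len(u) == len(v)
                        and self.root[u] != self.root[v]
                        and all(self.root[p] == self.root[q]
                                for p, q in zip(u[1:], v[1:]))):
                    args = tuple(self.prf_eq(p, q) for p, q in zip(u[1:], v[1:]))
                    self.assert_eq(u, v, ("congr", u[0]) + args)

    def assert_neq(self, a, b, proof):
        if self.root[a] == self.root[b]:
            raise Contradiction(("falsei", self.prf_eq(a, b), proof))
        self.neqs.append((a, b, proof))

cc = CongruenceClosure(["x", "y", ("f", "x"), ("f", "y")])
cc.assert_neq(("f", "x"), ("f", "y"), "h_neq")
try:
    cc.assert_eq("x", "y", "h_eq")
    refutation = None
except Contradiction as c:
    refutation = c.proof
```

Here the hypothetical proof names `h_eq` and `h_neq` stand for the proofs that accompany the incoming assertions; the resulting refutation combines them with the congr and falsei rules.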



3.2 Proof-Generation in the Simplex Algorithm 

Now we turn our attention to the decision procedure for the theory Z of integer 
numerals along with the addition, subtraction and comparison operators. As a
notational convenience
we also consider multiplication by integer numerals, which we write using the
infix • operator. The decision problem for Z is essentially the problem of de- 
ciding whether one linear inequality is a consequence of several other inequal- 
ities. There are several decision procedures for Z in the literature. Some cover 
only special cases [AS80,Pra77,Sho81] while others attempt to solve the general 
case [Ble74,Nel81]. Here we will consider a general decision procedure based on 
the Simplex algorithm for linear programming, as described in [Nel81]. Like most 
decision procedures for Z, Simplex is not complete since it works essentially with 
rational numbers. However, Simplex is powerful enough to handle the kinds of 
inequalities that typically arise in program verification. For space reasons, we 
discuss here only the very basic properties of the Simplex algorithm used
as a decision procedure and we focus on the modifications necessary for proof 
generation. The reader is invited to consult [Nec98,Nel81] for more details and 
a complete running example. 

The internal data structure used by the Simplex algorithm is the tableau,
which consists of a matrix of rational numbers $q_{ij}$ with rows $i \in 1..r$ and
columns $j \in 1..c$ along with a vector of rational numbers $q_{i0}$. All rows and
columns in the tableau are owned by an expression in the E-DAG. We write
$R(i)$ to denote the owner of row $i$ and $C(j)$ to denote the owner of column $j$.
The main property of the tableau is that each row owner can be expressed as a
linear combination of column owners, as follows:

$$R(i) \simeq q_{i0} + \sum_{j=1}^{c} q_{ij} \cdot C(j) \qquad (i = 1..r) \tag{2}$$

We use the $\simeq$ notation to denote equality between symbolic expressions modulo
the rules of the commutative group of addition.
The Simplex tableau as described so far encodes only the linear relationships
between the owning expressions. The actual encoding of the input inequality
assertions is by means of restrictions on the values that owning expressions can
take. There are two kinds of restrictions that Simplex must maintain. A row or
a column can be either +-restricted, which means that the owning expression
can take only values greater than or equal to zero, or *-restricted, in which case the
owning expression can only be equal to zero.

To illustrate the operation of Simplex as a decision procedure for arithmetic 
consider the task of detecting the contradiction in the following set of literals: 

$$\{\, 1 - x \geq 0,\quad 1 - y \geq 0,\quad -2 + x + y \geq 0,\quad -1 + x - y \geq 0 \,\}$$

When Simplex asserts an inequality, it rewrites the inequality as $e \geq 0$ and
it introduces a new row in the tableau owned by $e$. To simplify the notation
we are going to use the names $s_1, \ldots, s_4$ for the left-hand sides of our four
inequalities. After adding each row, Simplex attempts to mark it as +-restricted
(since the input assertion comes with a proof that the owning expression is
positive). But in doing so the tableau might become unsatisfiable, so Simplex
first performs Gaussian elimination (called pivoting in Simplex terminology) to
increase the entry in column 0 to a strictly positive value. If it is possible to
bring the tableau into a form where each +-restricted row has a positive entry in
column 0, then the tableau is satisfiable. One particular assignment that satisfies
all +-restrictions is obtained by setting all variables or expressions owning the
columns to zero.

In the process of adding the third inequality from above, the tableau is as
shown below in position (a). The row owned by $s_3$ cannot be marked
as +-restricted because $q_{30} = -2 < 0$. To increase the value of $q_{30}$ Simplex
performs two pivot operations (eliminating $x$ from $s_1$ and $y$ from $s_2$) and brings
the tableau into the state shown below in position (b).
Now Simplex can safely mark the row $s_3$ as +-restricted and, if it were not for
the need to detect all equalities between variables, it could proceed to process
the fourth inequality. In order to detect all equalities easily, Simplex must
maintain the tableau in a form in which all owners of rows and columns that are
constrained to be equal to zero are marked as *-restricted. It turns out that in
order to detect rows that must be made *-restricted it is sufficient to look for
+-restricted rows that become maximized at 0. A row $i$ is said to be maximized at
$q_{i0}$ when all its non-zero entries are either in *-restricted columns or are negative
and are in +-restricted columns. One such row is $s_3$ in the tableau (b). Since
$s_1$ and $s_2$ are known to be positive and $s_3$ is a negative linear combination of
them, it follows that $s_3 \leq q_{30} = 0$. On the other hand, after processing the third
assertion we know that $s_3 \geq 0$, which leads Simplex to decide that $s_3$ is both
+-restricted and maximized at 0, hence it must be equal to zero. Furthermore,
all non-zero entries in the row of $s_3$ must now be in *-restricted columns. Thus
Simplex marks all of $s_1$, $s_2$, and $s_3$ as *-restricted rows, bringing the tableau
into the state shown above in position (c). In this state Simplex notices that the
rows of $x$ and $y$ differ only in *-restricted columns (which are known to be zero),
hence it announces that $x$ is equal to $y$.
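
The definition of a maximized row translates directly into a small predicate, sketched below in Python. The function name `is_maximized` and the dictionary encoding are hypothetical. The example rows are not copied from the tableau figures; they are re-derived here from the four inequalities under the assumption that the two pivots make $s_1$ and $s_2$ column owners, which gives $s_3 = -s_1 - s_2$ and $s_4 = -1 - s_1 + s_2$.

```python
def is_maximized(row, restriction):
    """Check the 'maximized' condition for one tableau row.

    row:         maps a column owner to its coefficient q_ij (constant q_i0 excluded)
    restriction: maps a column owner to '+' or '*'; unrestricted owners are absent
    """
    for owner, q in row.items():
        if q == 0:
            continue                      # zero entries never matter
        kind = restriction.get(owner)
        if kind == "*":
            continue                      # the owner is known to be zero
        if kind == "+" and q < 0:
            continue                      # the owner is >= 0 and the entry is negative
        return False                      # this entry could still increase the row
    return True

# Row of s3 after the two pivots (assumed derivation): s3 = 0 - s1 - s2.
row_s3 = {"s1": -1, "s2": -1}
# Row of s4 once added (assumed derivation): s4 = -1 - s1 + s2.
row_s4 = {"s1": -1, "s2": 1}

plus_restricted = {"s1": "+", "s2": "+"}   # restrictions while processing s3
star_restricted = {"s1": "*", "s2": "*"}   # restrictions after s1 = s2 = s3 = 0 is detected

s3_maximized = is_maximized(row_s3, plus_restricted)
s4_maximized = is_maximized(row_s4, star_restricted)
```

Under these assumptions the row of $s_3$ is maximized at 0 while $s_1$ and $s_2$ are only +-restricted, and the row of $s_4$ becomes maximized at $-1$ only once both columns are *-restricted, since it has a positive entry in the column of $s_2$.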

Finally, when the fourth assertion is added, the tableau becomes as shown
in position (d) above. Now Simplex notices that $s_4$ is maximized at $-1$ and
consequently that it is impossible to increase the value of $q_{40}$ by pivoting. Since
Simplex also knows that $s_4 \geq 0$ (from the fourth assertion), it has discovered a
contradiction.

In the rest of this section we discuss how one can extract proofs of variable 
equalities and contradictions from the Simplex tableau. Before that let us point 
out that the implementation of the undo feature in Simplex does not have to 
revert the tableau to the original state at the time of the matching snapshot. 
Instead, all it has to do is to remove the rows, columns and restrictions that 
were added since then. This is much less expensive than a full undo. 

The operation of the Simplex algorithm maintains the following invariants: 

S1. If $R(i)$ is restricted (either +-restricted or *-restricted) then there is a proof
of $R(i) \geq 0$. We refer to this proof as Proof($R(i)$). Similarly for columns.

S2. If $R(i)$ is +-restricted then $q_{i0} \geq 0$.

S3. If $R(i)$ is *-restricted then $q_{i0} = 0$ and $q_{ij} \neq 0$ implies $C(j)$ is *-restricted.

S4. If $C(j)$ is *-restricted then there exists a *-restricted $R(i)$ such that $q_{ij} < 0$
and $q_{ik} \leq 0$ for all $k > j$. We say that row $i$ restricts column $j$.

Only Invariant S1 is introduced solely for the purpose of proof generation;
the others are necessary for the correct operation of Simplex.

Simplex detects contradictions when it tries to add a +-restriction to a row
$i$ that is maximized at a negative value $q_{i0}$. The key ingredient of a proof of
falsehood in this case is a proof that $R(i) \leq q_{i0}$. Simplex constructs this proof
indirectly by constructing first a proof that $R(i) = q_{i0} + E$ and then a proof
that $E \leq 0$. Furthermore, Simplex chooses $E$ to be the expression constructed
as the linear combination of column owners as specified by the entries in row
$i$. If we ignore the *-restricted columns, then all of the non-zero entries in the
maximized row $i$ are negative and are in columns $j$ that are +-restricted. For
each such column, according to Invariant S1, there is a proof that $C(j) \geq 0$.

To construct these proofs Simplex uses the inference rules shown below: 

$$\frac{R_i \geq 0 \quad R_i = q + E \quad E \leq 0}{\bot}\ \textit{sfalse}\ (q < 0) \qquad \frac{}{E_1 = E_2}\ \textit{arith}\ (E_1 \simeq E_2) \qquad \frac{E_1 \leq 0 \quad E_2 \geq 0}{E_1 + q \cdot E_2 \leq 0}\ \textit{geqadd}\ (q \leq 0)$$

mapRow(i) = Φ ← ∅
    foreach k = 1..c such that q_ik < 0 do Φ(C(k)) ←+ q_ik
    foreach k = 1..c such that q_ik > 0 do Φ ← mapCol(k, q_ik, Φ)
    return Φ

mapCol(j, q, Φ) = let row i be the restrictor of j as in Invariant S4 (q_ij < 0)
    foreach k ≠ j such that q_ik < 0 do Φ(C(k)) ←+ q · q_ik / q_ij
    foreach k such that q_ik > 0 do Φ ← mapCol(k, −q · q_ik / q_ij, Φ)
    return Φ

Fig. 4. Extracting coefficients from the Simplex tableau.



The sfalse rule is used to generate contradictions as explained above. Its 
first hypothesis is obtained directly from the proof of the incoming inequality 
R{i) > 0. The second hypothesis is constructed directly using the rule arith 
whose side-condition holds because of the main tableau representation invariant. 
Finally, the third hypothesis is constructed using repeated use of the geqadd 
rule for all the elements of the negative linear combination of restricted column 
owners, as read from the row i in the tableau. It is a known fact from linear 
algebra that a contradiction is entailed by a set of linear inequalities if and 
only if a false inequality involving only numerals can be constructed from a 
positive linear combination of the original inequalities. Thus these rules are not 
only necessary for Simplex but also sufficient for any linear arithmetic proof 
procedure. 

The situation is somewhat more complicated due to the presence of *-restricted
columns that might contain strictly positive entries in a maximized
row (such as is the case in the row $s_4$ of tableau (d) shown before). To solve
this complication we must also be able to express every *-restricted column as a
negative linear combination of restricted owners. Simplex uses the two functions
mapRow and mapCol shown in Figure 4 to construct negative linear combinations
for a maximized row or a *-restricted column. In Figure 4 the notation Φ denotes
a map from expressions to negative rational factors. All integer numerals n are
represented as the numeral 1 with coefficient n. We use ∅ to denote the empty
map. The operation Φ(E) ←+ q updates the map Φ, increasing the coefficient of
E by q.

The reader is invited to verify using the Simplex invariants that if mapRow 
is invoked on a maximized row and if mapCol is invoked on a *-restricted col- 
umn with a positive factor q then the resulting map contains only restricted 
expressions with negative coefficients. The termination of mapCol is ensured by 
the Invariant S4. 

The reader can verify that by running mapRow(4) on the tableau (d) we
obtain that $s_4 \simeq -1 - 2 \cdot s_1 - 1 \cdot s_3$, which in turn says that the negation of
the fourth inequality can be verified by multiplying the first inequality by 2 and
adding the third inequality.
The Simplex equality proofs are generated using the inference rule seq shown
below.

$$\frac{E_1 - E_2 = E \quad E \leq 0 \quad E_2 - E_1 = E' \quad E' \leq 0}{E_1 = E_2}\ \textit{seq}$$

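
The identity for $s_4$ can be reproduced with the Python sketch below. Several things here are assumptions rather than transcriptions: the tableau rows are re-derived from the four inequalities (with $s_1$ and $s_2$ as column owners and $s_3$ restricting both columns), and the elimination step in `map_col` is derived from equation (2) and Invariant S3 as $C(j) = R(i)/q_{ij} - \sum_{k \neq j} (q_{ik}/q_{ij}) \cdot C(k)$, so its signs and the term for the restricting row owner may differ from the formulation in Figure 4. The names `map_row`, `map_col`, `tableau` and `restrictor` are hypothetical.

```python
from fractions import Fraction

def map_col(tableau, restrictor, j, q, phi):
    """Add q * C(j) to phi, rewriting the *-restricted column owner j
    through its restrictor row i (whose constant entry is 0 by Invariant S3)."""
    i = restrictor[j]
    row = tableau[i]
    qij = row[j]                               # negative by Invariant S4
    phi[i] = phi.get(i, 0) + q / qij           # contribution of the row owner R(i)
    for k, qik in row.items():
        if k in (j, "1") or qik == 0:
            continue
        factor = -q * qik / qij
        if qik < 0:
            phi[k] = phi.get(k, 0) + factor    # negative coefficient, keep it
        else:
            map_col(tableau, restrictor, k, factor, phi)  # eliminate C(k) as well
    return phi

def map_row(tableau, restrictor, i):
    """Express row owner i as a constant plus a negative combination of restricted owners."""
    phi = {}
    for k, qik in tableau[i].items():
        if qik == 0:
            continue
        if k == "1" or qik < 0:                # "1" carries the constant q_i0
            phi[k] = phi.get(k, 0) + qik
        else:
            map_col(tableau, restrictor, k, qik, phi)
    return phi

F = Fraction
# Assumed rows of tableau (d): s3 = 0 - s1 - s2 and s4 = -1 - s1 + s2.
tableau = {
    "s3": {"1": F(0), "s1": F(-1), "s2": F(-1)},
    "s4": {"1": F(-1), "s1": F(-1), "s2": F(1)},
}
restrictor = {"s1": "s3", "s2": "s3"}          # row s3 restricts both columns

phi = map_row(tableau, restrictor, "s4")

# Cross-check by expanding the s_i in terms of x and y.
defs = {
    "1": {"1": 1},
    "s1": {"1": 1, "x": -1},
    "s3": {"1": -2, "x": 1, "y": 1},
    "s4": {"1": -1, "x": 1, "y": -1},
}
expanded = {}
for owner, coeff in phi.items():
    for var, c in defs[owner].items():
        expanded[var] = expanded.get(var, 0) + coeff * c
```

With these assumptions `phi` assigns $-1$ to the numeral 1, $-2$ to $s_1$ and $-1$ to $s_3$, and expanding it in terms of $x$ and $y$ gives back $-1 + x - y$, the left-hand side of the fourth inequality.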
Consider for example the case of two rows $i_1$ and $i_2$ whose entries are distinct
only in *-restricted columns. We temporarily add to the tableau a new row $r$
whose entries are $q_{i_1 j} - q_{i_2 j}$. Since row $r$ is maximized at 0 we can use mapProof
to produce the first two hypotheses of seq, just like we did for sfalse. Then
we negate the entries in $r$ and we use again mapProof to produce the last two
hypotheses.
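
As a reading aid, the reason such a rule is sound can be spelled out in three steps, assuming the four hypotheses of seq have the form $E_1 - E_2 = E$, $E \leq 0$, $E_2 - E_1 = E'$ and $E' \leq 0$:

```latex
\begin{align*}
E_1 - E_2 = E,\ E \le 0 &\;\Longrightarrow\; E_1 \le E_2,\\
E_2 - E_1 = E',\ E' \le 0 &\;\Longrightarrow\; E_2 \le E_1,\\
E_1 \le E_2,\ E_2 \le E_1 &\;\Longrightarrow\; E_1 = E_2.
\end{align*}
```

The two maximized rows (the temporary row and its negation) supply exactly the two inequalities in opposite directions.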

4 Lessons Learned while Building and Using Touchstone 

In this section we describe our initial experience with building and using the 
Touchstone theorem prover. We programmed the theorem prover in the Standard 
ML of New Jersey dialect of ML. The whole project, including the control core 
along with the congruence closure and Simplex decision procedures and also with 
decision procedures for modular arithmetic and type checking, consists of around 
11,000 lines of source code. Of these, about 3,000 are dedicated solely to proof 
representation, proof generation, proof optimization (such as local reduction and 
turning proofs by contradiction into simpler direct proofs) and proof checking. 

The relatively small size of the proof generating component in Touchstone 
can be explained by the fact that most heuristics and optimizations that are 
required during proving are irrelevant to proof generation. Take for example the 
Simplex decision procedure. Its implementation has over 2000 lines of code, a 
large part of which encodes heuristics for selecting the best sequence of pivots. 
The proof-generating part of Simplex is a fairly straightforward reading of the 
tableau once the contradiction has been found, as shown in Figure 4.

One important design feature of proof generation in Touchstone is that proofs 
are produced lazily, only when a contradiction is found. This is similar in spirit
to the lazy techniques described by Boulton [Bou92]. We do not generate explicit
proofs immediately when we discover and propagate an equality. Instead we
only record which decision procedure discovered the equality and later, if needed, 
we ask that decision procedure to generate an explicit proof of the equality. This 
lazy approach to proof generation means that the time consumed for generating 
proofs is small when compared to the time required for proving since in most 
large proving tasks a large part of the time is spent exploring unsuccessful paths. 
Our experiments with producing type safety proofs for assembly language show 
that only about 15% of the time is spent on proof generation. 
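The lazy scheme described above can be illustrated with a small sketch. This is not Touchstone's code (which is written in Standard ML); the class `LazyProofLog` and the functions `congruence_proof` and `simplex_proof` are hypothetical names chosen for illustration, and proofs are represented as plain strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Equality = Tuple[str, str]


@dataclass
class LazyProofLog:
    """Remembers who discovered each equality; builds proofs only on request."""
    producers: Dict[Equality, Callable[[Equality], str]] = field(default_factory=dict)
    proofs_built: int = 0

    def record(self, eq: Equality, producer: Callable[[Equality], str]) -> None:
        # Cheap bookkeeping at discovery time: no proof object is constructed.
        self.producers[eq] = producer

    def explain(self, eq: Equality) -> str:
        # Expensive step, performed only for equalities that a discovered
        # contradiction actually depends on.
        self.proofs_built += 1
        return self.producers[eq](eq)


# Hypothetical stand-ins for two decision procedures' proof generators.
def congruence_proof(eq: Equality) -> str:
    return f"cong({eq[0]} = {eq[1]})"


def simplex_proof(eq: Equality) -> str:
    return f"seq({eq[0]} = {eq[1]})"


log = LazyProofLog()
log.record(("f(a)", "f(b)"), congruence_proof)
log.record(("x", "y"), simplex_proof)
log.record(("u", "v"), simplex_proof)

# Suppose the contradiction that was eventually found uses only x = y.
needed = [("x", "y")]
proof = [log.explain(eq) for eq in needed]
```

Three equalities are recorded but only one proof is ever built, which is the source of the savings when most of the search explores unsuccessful paths.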

It was obvious from the beginning of the project that there would be a design 
and coding complexity cost to be paid for the proof-generating capability of the 
prover. We accepted this cost initially because we needed a mechanical way to 
build the proofs required for our proof-carrying code experiments. We did not 
anticipate how much this feature would actually simplify the building, testing 
and maintenance of the theorem prover. Indeed, we estimate that the ability to 
cross-check the operation of the prover during proof generation saved us many 
weeks or maybe months of testing and debugging. 

One particular aspect of the theorem prover that led to many design and 
programming errors was that all decision procedures and the control core must 
be incremental and undoable. This is complicated by decision procedures that 
perform a pseudo-undo operation in the interest of efficiency. For example, the 
Simplex decision procedure does not revert the tableau to the exact state it was 
at the last snapshot operation but only to an equivalent state obtained simply 
by deleting some rows and columns. In the presence of such decision procedures 
the exact order of intermediate subgoals and discovered equalities depends on 
all previously processed subgoals. This defeats a common technique for isolating 
a bug by reducing the size of the goal in which it is manifested. It often happens 
that by eliminating seemingly unrelated subgoals the error disappears because 
the order in which entailed equalities are discovered has changed. 
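To make the "incremental and undoable" requirement concrete, here is a minimal sketch of a store with snapshot and undo operations built on a trail of old values. The class `UndoableStore` and its method names are hypothetical. Note that this sketch restores the exact earlier state, whereas the Simplex procedure described above only restores an equivalent one.

```python
class UndoableStore:
    """Assertion store with snapshot/undo implemented by a trail."""

    def __init__(self):
        self.facts = {}   # fact name -> value
        self.trail = []   # (key, had_old_value, old_value), newest last

    def snapshot(self):
        # A snapshot is simply the current length of the trail.
        return len(self.trail)

    def assert_fact(self, key, value):
        # Remember what was there before, so the change can be reverted.
        self.trail.append((key, key in self.facts, self.facts.get(key)))
        self.facts[key] = value

    def undo_to(self, mark):
        # Pop trail entries, newest first, until the snapshot is reached.
        while len(self.trail) > mark:
            key, had_old, old = self.trail.pop()
            if had_old:
                self.facts[key] = old
            else:
                del self.facts[key]


store = UndoableStore()
store.assert_fact("x", 1)
mark = store.snapshot()
store.assert_fact("x", 2)   # overwrites an existing fact
store.assert_fact("y", 3)   # adds a new fact
store.undo_to(mark)
```

Forgetting to push an entry onto the trail for some mutation is exactly the kind of error the text describes: the change survives an undo and can make the prover unsound.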

Proof-generation as a debugging mechanism continues to be valuable even 
as the prover matures. While we observe a decrease in the number of errors, we 
also observe a sharp increase in the average size of the proving goals that trigger 
an error. Indeed the size of these goals is now such that it would have been 
impractical to debug the prover just by manual inspection of a trace. In addition, 
proof-generation gave us significant assistance with the upgrade and maintenance 
of the prover, as a broken invariant is promptly pointed out by the 
proof-generation infrastructure. We should also point out that for maximum assurance 
we checked proofs using the same small proof checker that we also use for 
proof-carrying code. However, we noticed that most prover errors surfaced as failures 
by the proof-generating infrastructure to produce a proof and only a very small 
number of bugs resulted in invalid proofs. A lesson that can be drawn here is 
that the software engineering advantages of proof-generation in theorem provers 
can be obtained by just going through the process of generating a proof without 
actually having to record the proof. 

4.1 Using Touchstone to Build Proof-Carrying Code 

The main motivation for building Touchstone was for use in a proof-carrying 
code system. A typical arrangement for the generation of proof-carrying code 
is shown in Figure 5. Note that on the right-hand side we have the untrusted 
components used to produce PCC while on the left-hand side we have the trusted 
infrastructure for checking PCC. 

This figure applies to the particular case in which the safety policy consists of 
type safety in the context of a simple first-order type system with pointers and 
arrays. The process starts with a source program written in a type-safe subset 
of the C programming language. The source program is given to a certifying 
compiler that, in addition to producing optimized machine code, also generates 
function specifications and loop invariants based on types. We will return to the 
issue of modeling types in first-order logic shortly. 

The code augmented with loop invariants is passed through a verification 
condition generator (VcGen) that produces a verification condition (VC). The 
VC is provable only if the code satisfies the loop invariants and the specifications 
and only if all memory operations are safe. To encode proof obligations for 
memory safety in a general way, VcGen emits formulas of the form “saferd(E)” 
to say that the memory address denoted by the symbolic expression E is readable, 
and “safewr(E, E')” to say that the value denoted by E' is writable at the 
address denoted by E. 

Fig. 5. A typical arrangement for building proof-carrying code. 

The verification condition is then passed to Touchstone, which proves it and 
returns a proof encoded in a variant of the Edinburgh LF language. This allows 
an LF type checker on the receiver side to fully validate the proof with respect 
to the verification condition. The key idea behind proof-carrying code is that 
the whole process of producing a safe executable can be split into a complex 
and slow untrusted component on one side and a simple and fast trusted safety 
checker on the other. It was therefore a key requirement that the proving and 
proof checking tasks be separated as shown in the picture. For more details on 
the system described here the reader is invited to consult [Nec98]. 

What remains to be discussed are the details of the logical theory that is 
used to model types and to derive memory safety. We use a theory of first-order 
types with constructors for types and a typing predicate. A few of the terms and 
formulas used along with three of the inference rules are shown below: 

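$$
\frac{A : \mathrm{array}(T, L) \qquad I \ge 0 \qquad I < L}{A + 4 * I : \mathrm{ptr}(T)}
\qquad
\frac{A : \mathrm{ptr}(\mathrm{pair}(T_1, T_2))}{A + 4 : \mathrm{ptr}(T_1)}
\qquad
\frac{A : \mathrm{ptr}(T)}{\mathrm{saferd}(A)}
$$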

The first rule says that a pointer to an element of type T can be obtained 
by indexing in an array whose element type is T. In this rule the array type is 
dependent on the length of the array and the element size is considered to be 
4 bytes. The second rule is used to reason about tuple destructors. Finally, the 
last rule is the only rule that introduces the saferd predicate, basically saying 
that in this safety policy readability of memory locations is dictated by types. 

To handle this theory of types in Touchstone we make heavy use of the 
“Subgoal” module. We implemented a tactic that does backward chaining on the 
rules of the theory. A central element of the tactic is a heuristic that finds likely 
valid formulas of the form “A : ptr(T)” by looking at the form of A and at the 
current assertions (originating from function preconditions and loop invariants 
in this case). While the other elements of Touchstone have some completeness 
properties, the tactic for the theory of types need only be powerful enough 
to “understand” the code produced by our compiler. And since the compiler 
starts with an obviously well-typed source program the only difficulties can be 
introduced by optimizations. In fact, our compiler is very aggressive in optimizing 
array-bounds checks and hence the theorem prover must itself be able to prove 
all the arithmetic facts that the compiler has discovered and proved. As a result 
the Simplex satisfiability procedure is exercised quite heavily in this setting. 
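A rough sketch of backward chaining over the array-indexing rule and the saferd rule may help. It is only an illustration under simplifying assumptions: formulas are nested tuples, the function names `prove` and `candidate_ptr_typings` are hypothetical, the tuple-destructor rule is omitted, and the arithmetic side conditions are looked up among the assumptions rather than discharged by Simplex as in Touchstone.

```python
# Formulas and expressions are plain tuples, for example
#   ("hastype", "a", ("array", "int", "n"))      for  a : array(int, n)
#   ("saferd", ("add", "a", ("mul", 4, "i")))    for  saferd(a + 4*i)

def candidate_ptr_typings(addr, assumptions):
    """Heuristic: for an address base + offset, try the element type of base."""
    if isinstance(addr, tuple) and addr[0] == "add":
        for fact in assumptions:
            if fact[0] == "hastype" and fact[1] == addr[1] and fact[2][0] == "array":
                yield ("hastype", addr, ("ptr", fact[2][1]))


def prove(goal, assumptions):
    """Backward chaining; returns a proof tree (nested tuples) or None."""
    if goal in assumptions:
        return ("assumption", goal)

    if goal[0] == "saferd":
        # Last rule: from A : ptr(T) conclude saferd(A). T is not given by the
        # goal, so candidate typings are guessed from the shape of the address.
        for typing in candidate_ptr_typings(goal[1], assumptions):
            sub = prove(typing, assumptions)
            if sub is not None:
                return ("rd", sub)
        return None

    if goal[0] == "hastype" and goal[2][0] == "ptr":
        # First rule: A : array(T, L), I >= 0, I < L  gives  A + 4*I : ptr(T).
        addr, elem = goal[1], goal[2][1]
        if isinstance(addr, tuple) and addr[0] == "add":
            base, offset = addr[1], addr[2]
            if isinstance(offset, tuple) and offset[:2] == ("mul", 4):
                idx = offset[2]
                for fact in assumptions:
                    if (fact[0] == "hastype" and fact[1] == base
                            and fact[2][0] == "array" and fact[2][1] == elem):
                        length = fact[2][2]
                        subs = [prove(fact, assumptions),
                                prove(("ge", idx, 0), assumptions),
                                prove(("lt", idx, length), assumptions)]
                        if all(s is not None for s in subs):
                            return ("index", *subs)
    return None


assumptions = {
    ("hastype", "a", ("array", "int", "n")),
    ("ge", "i", 0),
    ("lt", "i", "n"),
}
goal = ("saferd", ("add", "a", ("mul", 4, "i")))
proof = prove(goal, assumptions)
```

Dropping the bound `i < n` from the assumptions makes the search fail, which mirrors how an over-aggressive bounds-check elimination would surface as a failed proof attempt.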

Our experiments with Touchstone in this setting have shown several interest- 
ing facts. This separation of tasks does indeed achieve a separation of complexity 
and running cost. In terms of code size the untrusted components are about four 
times larger than the trusted ones. In particular the Touchstone prover is four 
times larger than the proof checker. Furthermore, while Touchstone grows con- 
tinuously as we incorporate more heuristics and better tactics we found that 
the proof-checking component has remained largely unchanged over a couple of 
years. 

A sample of the experimental data that we collected is shown in Figure 6. 
This table shows, for a few programs, the sizes of the verification condition gen- 
erated and of the associated proofs along with the time required for theorem 
proving and proof checking. The sizes are expressed in number of AST nodes 
while the timings are expressed in milliseconds. The measurements were performed 
on a DEC Alpha with a 21064 processor running at 175 MHz. Notice 
that in these experiments proof checking is about an order of magnitude faster 
than theorem proving. 
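The "order of magnitude" claim can be checked directly from the timings reported in Figure 6. The short script below only redoes that arithmetic; the variable names are ours.

```python
# (program, proving time in ms, checking time in ms), as reported in Figure 6
figure6 = [
    ("bcopy", 25, 4), ("edge", 143, 15), ("kmp", 108, 9), ("qsort", 127, 16),
    ("sharpen", 257, 23), ("simplex", 1272, 120), ("unpack", 1912, 92),
]

# Per-program ratio of proving time to checking time.
ratios = {name: prove_ms / check_ms for name, prove_ms, check_ms in figure6}

# Ratio of total proving time to total checking time.
overall = sum(p for _, p, _ in figure6) / sum(c for _, _, c in figure6)

for name, ratio in ratios.items():
    print(f"{name:8s} proving/checking = {ratio:5.1f}")
print(f"overall  proving/checking = {overall:5.1f}")
```

The per-program ratios range from about 6 (bcopy) to about 21 (unpack), and the ratio of the totals is about 14.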

At this point we would like to point out that our imperative implementation 
of the Nelson-Oppen prover appears to be much faster than the functional 
implementation of Boulton in the HOL prover, as described in [Bou93,Bou95]. We 
ran Touchstone on the 11 examples shown on page 94 in [Bou93]. All of these 
examples are very small, ranging from 9 to 43 AST nodes. While Boulton ran the 
measurements on a Sparcstation 2 with a 40 MHz processor, we ran them on an 
Alpha with a 175 MHz processor. But even after multiplying the Touchstone running 
times by a factor of 5 to compensate for this difference, Touchstone 
is still faster by a factor ranging from 5 to 80 on these examples. 

Program    Lines    VC size        Proving time    Proof size     Check time
                    (AST nodes)    (ms)            (AST nodes)    (ms)
bcopy      16       82             25              64             4
edge       88       224            143             528            15
kmp        67       483            108             344            9
qsort      142      1444           127             1770           16
sharpen    153      420            257             477            23
simplex    303      7055           1272            3912           120
unpack     259      5759           1912            1750           92

Fig. 6. Experimental data collected using the Touchstone prover in the context 
of generating PCC for type safety. 

There are several reasons behind the better performance in Touchstone. One 
of them is that Touchstone handles only a small fragment of first-order logic and 
can thus use a very fast goal-directed proof procedure. HOL extended with the 
Nelson-Oppen cooperative decision procedure on the other hand uses a more 
general proof procedure based on conversion to disjunctive normal form. Another 
possible reason for the disparity in the performance is that while HOL 
uses a satisfiability procedure for arithmetic based on Fourier-Motzkin variable 
elimination, Touchstone uses an efficient implementation of Simplex. Boulton 
explains that the Fourier-Motzkin elimination was chosen in HOL because of the 
simplicity of proof generation. We show that even an efficient version of Simplex 
can be made fully expansive, although almost surely at a larger programming 
cost. 

Finally, we suspect that another reason that makes Touchstone faster than 
Boulton’s implementation of the Nelson-Oppen strategy is the use of an 
imperative implementation style. By making very judicious use of memory, 
we observe very little garbage collection during proving. In contrast, Boulton’s 
implementation uses a functional programming style, leading to a very elegant 
implementation of a crucial part of the prover: the undo mechanism. Our 
implementation of undo, by comparison, is quite a bit more complex and is 
responsible for many of the bugs that we discovered. In quite a few cases we forgot 
to undo certain changes, thus leading to unsoundness. This did not turn out to be 
a big problem because the proof checking mechanisms quickly pointed out our errors. 

Other times the undo procedure mistakenly removed too many assertions, leading to 
situations in which the prover was not able to prove predicates that it was 
intended to prove. We were helped in this situation by the fact that we were using 
Touchstone essentially to verify that a number of optimizations performed by 
our compiler preserve type safety. By design of both the compiler and the prover, 
every failed proof attempt points to either a compilation bug or a completeness 
bug in the prover. This is how we found a very large number of bugs in the 
compiler and some in the prover. 
5 Conclusion 

We describe in this paper an implementation of a Nelson-Oppen theorem prover 
enhanced with the ability to generate easily-checkable proof objects for all the 
predicates it proves. Our implementation, just like the one described by Nelson, 
was designed with efficiency in mind in order to handle verification conditions of 
nontrivial programs. This led us to use more complex algorithms and 
implementation techniques than a related functional implementation in the context 
of the HOL theorem prover. 

The added complexity of our design seems to pay off in terms of efficiency. 
But it also led us to making many subtle design and programming errors that 
threatened both the soundness and the completeness of our prover. Fortunately, 
soundness was never in real danger because of Touchstone’s proof generating 
ability, which enables us to use a simple proof checker to validate the correctness 
of each run. 

As a general conclusion, we feel that the benefits of proof-generation in theo- 
rem provers clearly outweigh the additional cost of designing and implementing 
proof-generating decision procedures. 
Abstract. A program scheme looks like a recursive function definition, 
except that it has free variables ‘on the right hand side’. As is well-known, 
equalities between schemes can capture powerful program transforma- 
tions, e.g., translation to tail-recursive form. In this paper, we present a 
simple and general way to define program schemes, based on a partic- 
ular form of the wellfounded recursion theorem. Each program scheme 
specifies a schematic induction theorem, which is automatically derived 
by formal proof from the wellfounded induction theorem. We present 
a few examples of how formal program transformations are expressed 
and proved in our approach. The mechanization reported here has been 
incorporated into both the HOL and Isabelle/HOL systems. 



Program schemes form the foundation of an interesting class of program 
development methodologies which advocate the incremental instantiation of ab- 
stract programs, preserving important properties all the while, until a suitable 
concrete program results. There has been a great deal of work on program 
transformation; for background see [6,18,33,23,26]. 

Although program transformation theories are widely applied informally, 
work on program transformation in mechanized proof assistants is not as 
abundant, in spite of the evident interest in using such systems as platforms for 
program development and transformation. One reason for this may be that, 
currently, such environments (e.g., [13,25,22,3]) tend to be based on logics of total 
functions and it is not clear how a program scheme can be regarded as a total 
function, since many schemes allow instantiations such that the resulting func- 
tion is not total. In spite of this, we will describe a simple and general technique 
by which schemes may be defined such that totality is enforced. 

1 Formal Basis 

We work in a higher order logic commonly called HOL [13]; a description of the 
logic may be found in the Appendix. We adopt the common approach of using 
the native functions of the logic to represent programs; recursive programs are 
modelled with the use of a wellfounded recursion theorem. There are several 
equivalent definitions of wellfoundedness [28]; the following asserts that the 
relation $R : \alpha \to \alpha \to \mathrm{bool}$ is wellfounded iff every non-empty set has an 
$R$-minimal element. 
Definition 1 (Wellfoundedness). 

$$\mathrm{WF}(R) = \forall P.\ (\exists w.\ P\,w) \supset \exists min.\ P\,min \wedge \forall b.\ R\,b\,min \supset \neg P\,b.$$

From this definition, the following general induction and recursion theorems 
can be proved (the interested reader can find details in [30]): 

Theorem 2 (Wellfounded Induction). 

$$\mathrm{WF}(R) \supset (\forall x.\ (\forall y.\ R\,y\,x \supset P\,y) \supset P\,x) \supset \forall x.\ P\,x.$$



Theorem 3 (Wellfounded Recursion). 

$$\forall f\ R\ M.\ (f = \mathrm{WFREC}\ R\ M) \supset \mathrm{WF}(R) \supset \forall x.\ f(x) = M\ (f \mid R, x)\ x.$$

$\mathrm{WFREC} : (\alpha \to \alpha \to \mathrm{bool}) \to ((\alpha \to \beta) \to (\alpha \to \beta)) \to \alpha \to \beta$ can be thought of 
as a ‘controlled’ fixpoint operator; since it is only used to prove Theorem 3, we 
omit its quite obfuscatory definition. Also used in the statement of Theorem 3 
is a ternary operator that restricts a function to a certain set of values.¹ 

Definition 4 (Restriction). 

$$(f \mid R, y) = \lambda x.\ \mathbf{if}\ R\,x\,y\ \mathbf{then}\ f\,x\ \mathbf{else}\ \mathrm{Arb}.$$

Theorem 5. $R\,x\,y \supset (f \mid R, y)\,x = f\,x.$
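The following Python analogue illustrates restriction and the recursion equation of Theorem 3. It is only a sketch: the paper omits the definition of WFREC, so `wfrec` below is a hypothetical stand-in that simply satisfies the equation f(x) = M (f | R, x) x by construction, and `ARB` models the fixed but arbitrary value Arb.

```python
ARB = object()  # stands in for HOL's fixed but arbitrary value Arb


def restrict(f, R, y):
    """(f | R, y): agrees with f on arguments R-below y, Arb elsewhere."""
    return lambda x: f(x) if R(x, y) else ARB


def wfrec(R, M):
    """A 'controlled' fixpoint: f(x) = M (f | R, x) x, as in Theorem 3."""
    def f(x):
        return M(restrict(f, R, x))(x)
    return f


# Functional for factorial: recursive calls go through the argument `rec`.
fact_functional = lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1)

less_than = lambda a, b: 0 <= a < b       # wellfounded on the naturals
fact = wfrec(less_than, fact_functional)

# A functional whose recursive call does not descend along the relation.
bad_functional = lambda rec: lambda n: rec(n + 1)
bad = wfrec(less_than, bad_functional)
```

Here `fact(5)` evaluates to 120 because every recursive call is on an R-smaller argument (Theorem 5 applies), while `bad(3)` returns `ARB`: the restriction blocks the call on a larger argument instead of looping forever.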

2 The Technique 

We shall present our approach with the hand derivation of an example; the 
automation of the technique will be taken up in Section 3. Consider the following 
description of the ‘while’ construct familiar from imperative programming: 

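$$\mathit{While}\ s = \mathbf{if}\ B\,s\ \mathbf{then}\ \mathit{While}\ (C\,s)\ \mathbf{else}\ s.$$

This is a syntactic specification of a class of functions determined by the 
parameters $B$ and $C$. To start the derivation, the description is translated into 
a functional: 

$$\lambda \mathit{While}\ s.\ \mathbf{if}\ B\,s\ \mathbf{then}\ \mathit{While}\ (C\,s)\ \mathbf{else}\ s. \tag{1}$$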

¹ In set theory, or logics of partial functions, function restriction may result in a partial 
function. In a logic of total functions, such as HOL, a restriction of a function is still 
a total function, giving a fixed but arbitrary value when applied outside of the 
restriction. 
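Instantiating $M$ in the recursion theorem with (1) yields 

$$
\begin{array}{l}
[\mathrm{WF}\ R,\ f = \mathrm{WFREC}\ R\ (\lambda \mathit{While}\ s.\ \mathbf{if}\ B\,s\ \mathbf{then}\ \mathit{While}\ (C\,s)\ \mathbf{else}\ s)] \\
\vdash \forall x.\ f(x) = \mathbf{if}\ B\,x\ \mathbf{then}\ (f \mid R, x)\,(C\,x)\ \mathbf{else}\ x.
\end{array}
\tag{2}
$$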

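By assuming $\forall s.\ B\,s \supset R\,(C\,s)\,s$ (we discuss the origin of this assumption 
in Section 3), it is possible to derive 

$$
\begin{array}{l}
[\mathrm{WF}\ R,\ \forall s.\ B\,s \supset R\,(C\,s)\,s, \\
\ f = \mathrm{WFREC}\ R\ (\lambda \mathit{While}\ s.\ \mathbf{if}\ B\,s\ \mathbf{then}\ \mathit{While}\ (C\,s)\ \mathbf{else}\ s)] \\
\vdash f\,x = \mathbf{if}\ B\,x\ \mathbf{then}\ f\,(C\,x)\ \mathbf{else}\ x.
\end{array}
\tag{3}
$$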

The assumptions $\mathrm{WF}\ R$ and $\forall s.\ B\,s \supset R\,(C\,s)\,s$ are the ‘termination 
conditions’ of (3). Now we apply the Principle of Constant Definition to define While. 
This is the central step in our method. The indefinite description operator ($\varepsilon$) is 
applied to choose a wellfounded relation $R$ meeting the termination conditions. 
Notice also that the distinction between parameters ($B$ and $C$) and arguments 
($s$) is supported by the different binding sites in the definition: parameters are 
arguments to the definition itself, while the original argument $s$ is a bound variable 
in the functional. 

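$$
\begin{array}{l}
\mathrm{While} = \lambda B\ C.\ \mathrm{WFREC}\ (\varepsilon R.\ \mathrm{WF}\ R \wedge \forall s.\ B\,s \supset R\,(C\,s)\,s) \\
\qquad (\lambda \mathit{While}\ s.\ \mathbf{if}\ B\,s\ \mathbf{then}\ \mathit{While}\ (C\,s)\ \mathbf{else}\ s).
\end{array}
\tag{4}
$$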

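Eliminating (4) from the hypotheses of (3) yields 

$$
\begin{array}{l}
[\mathrm{WF}\ (\varepsilon R.\ \mathrm{WF}\ R \wedge \forall s.\ B\,s \supset R\,(C\,s)\,s), \\
\ \forall s.\ B\,s \supset (\varepsilon R.\ \mathrm{WF}\ R \wedge \forall s.\ B\,s \supset R\,(C\,s)\,s)\ (C\,s)\ s] \\
\vdash \mathrm{While}\ B\ C\ s = \mathbf{if}\ B\,s\ \mathbf{then}\ \mathrm{While}\ B\ C\ (C\,s)\ \mathbf{else}\ s.
\end{array}
$$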

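Finally, assuming $\mathrm{WF}(R)$ and $\forall s.\ B\,s \supset R\,(C\,s)\,s$ and then applying the 
Select Axiom allows the conclusion 

$$
\begin{array}{l}
[\mathrm{WF}\ R,\ \forall s.\ B\,s \supset R\,(C\,s)\,s] \\
\vdash \mathrm{While}\ B\ C\ s = \mathbf{if}\ B\,s\ \mathbf{then}\ \mathrm{While}\ B\ C\ (C\,s)\ \mathbf{else}\ s.
\end{array}
\tag{5}
$$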
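As an informal illustration of equation (5), the sketch below reads the scheme as a Python function with parameters B and C, instantiates it with Euclid's algorithm, and spot-checks the termination condition on a finite sample of states. The instance, the candidate relation `R`, and the helper `termination_condition_holds` are our own choices; a finite check is of course no substitute for the formal proof obligations.

```python
def While(B, C):
    """Scheme instance: While B C s = if B s then While B C (C s) else s."""
    def run(s):
        # Iterative reading of the tail-recursive equation (5).
        while B(s):
            s = C(s)
        return s
    return run


# Instance: Euclid's algorithm on pairs (a, b) of naturals.
B = lambda s: s[1] != 0
C = lambda s: (s[1], s[0] % s[1])
gcd_state = While(B, C)

# Candidate termination relation R: the second component strictly decreases.
R = lambda new, old: 0 <= new[1] < old[1]


def termination_condition_holds(states):
    """Spot-check  forall s. B s ==> R (C s) s  on a finite sample."""
    return all(R(C(s), s) for s in states if B(s))


sample = [(a, b) for a in range(20) for b in range(20)]
result = gcd_state((48, 18))
```

The parameters B and C are supplied once, when the instance is built, while the state s is the argument of the resulting function; this mirrors the separation of binding sites in definition (4).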

Remark 6. The derived equation (5) looks like a normal higher-order function; 
however, had we tried to define While as a higher order function in the 
standard manner, i.e., with no parameters, then $B$, $C$ and $s$ would be treated as 
arguments (and thus bound in the functional) and the termination conditions 
would be equivalent to the proposition 

$$\exists R.\ \mathrm{WF}(R) \wedge \forall B\ C\ s.\ R\ (B, C, C\,s)\ (B, C, s),$$

which is not provable since $C$ could be taken to be the identity function, but 
there is no wellfounded relation $R$ such that $R\,x\,x$. 



Remark 7. Our treatment of parameters is not specific to wellfounded recursion: 
it works for any fixpoint operator. In particular, for any fix satisfying the 
well-known equation $\mathrm{fix}(M) = M\ (\mathrm{fix}(M))$, it is merely a common subexpression 
elimination to get $\forall g.\ (g = \mathrm{fix}(M)) \supset \forall x.\ g\,x = M\,g\,x$. By abstracting free 
variables $V_1, \ldots, V_k$ of $M$, this can be transformed to 

$$\vdash \forall g.\ (g\ V_1 \ldots V_k = \mathrm{fix}(M)) \supset \forall x.\ g\ V_1 \ldots V_k\ x = M\ (g\ V_1 \ldots V_k)\ x.$$

With hindsight, the treatment of parameters in inductive definition packages 
such as those reported in [21,24,15] can be seen as concrete applications of this 
theorem. 



2.1 Induction 



It is well known that the wellfounded relation used to prove termination for a 
function can also be used to derive an induction theorem, in which the induction 
predicate is assumed to hold for the arguments to recursive calls. For ML-style 
pattern-matching recursion equations of the form 



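$$
\begin{array}{rcl}
f(pat_1) & = & rhs_1[f(a_{11}), \ldots, f(a_{1k_1})] \\
 & \vdots & \\
f(pat_n) & = & rhs_n[f(a_{n1}), \ldots, f(a_{nk_n})],
\end{array}
\tag{6}
$$

an induction theorem of the following form (where $\Gamma(a_{ij})$ is the context of 
recursive call $f(a_{ij})$) can be derived by formal proof from Theorem 2: 

$$
\begin{array}{l}
\forall P. \\
\quad \left( \forall \left( (\forall(\Gamma(a_{11}) \supset P\,a_{11})) \wedge \cdots \wedge (\forall(\Gamma(a_{1k_1}) \supset P\,a_{1k_1})) \right) \supset P(pat_1) \right) \\
\quad \wedge \cdots \wedge \\
\quad \left( \forall \left( (\forall(\Gamma(a_{n1}) \supset P\,a_{n1})) \wedge \cdots \wedge (\forall(\Gamma(a_{nk_n}) \supset P\,a_{nk_n})) \right) \supset P(pat_n) \right) \\
\quad \supset \forall v.\ P\,v.
\end{array}
$$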



The assumptions to this theorem will be the termination conditions of /, as 
explained in [31], where the automatic derivation of such induction theorems is 
described. An earlier treatment of the derivation of induction for functions in a 
simpler object language is described in [5]. 

It might seem to be problematic to derive induction for program schemes 
since the termination relation is not known; however, an appropriate induction 
theorem can still be derived: all that is required is to assume that a suitable 
termination relation exists. We demonstrate the idea by deriving the following 
induction theorem for the While function: 



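$$
\begin{array}{l}
[\mathrm{WF}\ R,\ \forall s.\ B\,s \supset R\,(C\,s)\,s] \\
\vdash \forall P.\ (\forall s.\ (B\,s \supset P\,(C\,s)) \supset P\,s) \supset \forall v.\ P\,v.
\end{array}
\tag{7}
$$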



The derivation begins by assuming the antecedent of (7) and the termination 
conditions of the definition. 



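$$
\begin{array}{rll}
1. & \forall s.\ (B\,s \supset P\,(C\,s)) \supset P\,s & \text{Assume} \\
2. & \forall s.\ B\,s \supset R\,(C\,s)\,s & \text{Assume} \\
3. & [2] \vdash B\,s \supset R\,(C\,s)\,s & \forall\text{-elim (2)} \\
4. & \forall y.\ R\,y\,s \supset P\,y & \text{Assume} \\
5. & [4] \vdash R\,(C\,s)\,s \supset P\,(C\,s) & \forall\text{-elim (4)} \\
6. & [2, B\,s] \vdash R\,(C\,s)\,s & \text{Undisch (3)} \\
7. & [4, 2, B\,s] \vdash P\,(C\,s) & \supset\text{-elim (5) (6)} \\
8. & [4, 2] \vdash B\,s \supset P\,(C\,s) & \supset\text{-intro (7)} \\
9. & [1, 4, 2] \vdash P\,s & \supset\text{-elim (1) (8)} \\
10. & [1, 2] \vdash (\forall y.\ R\,y\,s \supset P\,y) \supset P\,s & \supset\text{-intro (9)} \\
11. & [1, 2] \vdash \forall s.\ (\forall y.\ R\,y\,s \supset P\,y) \supset P\,s & \forall\text{-intro (10)}
\end{array}
$$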



In step 11, the antecedent of the wellfounded induction theorem (Theorem 2) 
has been derived, and a few further obvious steps deliver (7), as desired. 

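Remark 8. If the semantics of a Hoare triple $\{P\}\ C\ \{Q\}$ are defined by 

$$\mathrm{Hoare}\ P\ C\ Q = \forall s.\ P\,s \supset Q\,(C\,s)$$

then the following While rule for total correctness has an easy² proof by induction 
with (7): 

$$
\begin{array}{l}
[\mathrm{WF}\ R,\ \forall s.\ B\,s \supset R\,(C\,s)\,s] \\
\vdash \mathrm{Hoare}\ (\lambda s.\ P\,s \wedge B\,s)\ C\ P \supset \mathrm{Hoare}\ P\ (\mathrm{While}\ B\ C)\ (\lambda s.\ P\,s \wedge \neg B\,s).
\end{array}
$$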
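The While rule can be exercised on a concrete loop. The sketch below is an informal test on a finite set of states, not a proof: `hoare` checks the definition of Hoare only on the sample it is given, and the loop (summing 1 to 10 by counting down), the invariant `P`, and all names are our own illustrative choices.

```python
def While(B, C):
    """While B C s = if B s then While B C (C s) else s, read as a loop."""
    def run(s):
        while B(s):
            s = C(s)
        return s
    return run


def hoare(P, C, Q, states):
    """Hoare P C Q, checked on a finite sample: P s implies Q (C s)."""
    return all(Q(C(s)) for s in states if P(s))


# Summing 1..10 by counting i down; a state is a pair (i, acc).
B = lambda s: s[0] > 0
C = lambda s: (s[0] - 1, s[1] + s[0])
# Invariant: what has been accumulated plus what remains always totals 55.
P = lambda s: s[0] >= 0 and s[1] + s[0] * (s[0] + 1) // 2 == 55

states = [(i, acc) for i in range(0, 11) for acc in range(0, 56)]

# Premise of the rule: the body preserves the invariant while the guard holds.
premise = hoare(lambda s: P(s) and B(s), C, P, states)
# Conclusion: the loop establishes the invariant together with the negated guard.
conclusion = hoare(P, While(B, C), lambda s: P(s) and not B(s), states)
final_state = While(B, C)((10, 0))
```

Termination is witnessed here by the first component of the state, which decreases at every iteration and is bounded below by zero.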

3 Automation 

A useful level of support for deriving program schemes can be supplied by gen- 
eralizing and automating the steps taken in the While example. The particular 
interface we have implemented takes as input recursion equations of the form 
given in (6) and performs the following steps: 

1. Translates the equations into a functional $\mathcal{F}$, using a pattern-matching 
translation based on those used in functional programming language 
implementations [2,20]. 

2. Instantiates $M$ in the recursion theorem with $\mathcal{F}$. 

3. Extracts termination conditions $TC_1(R), \ldots, TC_k(R)$ from the results of step 
2, where $R$ is a variable representing the wellfounded relation. 

4. Computes the free variables $P_1 \ldots P_l$ of $\mathcal{F}$, then defines the constant denoting 
the desired function: 

$$f = \lambda P_1 \ldots P_l.\ \mathrm{WFREC}\ (\varepsilon R.\ \mathrm{WF}(R) \wedge TC_1(R) \wedge \ldots \wedge TC_k(R))\ \mathcal{F}.$$

Two things are important here: (1) using the description operator to choose 
a suitable wellfounded relation meeting the results of step 3; and (2) making 
sure to separate parameters from arguments in the definition. 

5. Eliminates the result of step 4 from the hypotheses of the result of step 2 
(the instantiated recursion theorem). 

6. Assumes each of $\mathrm{WF}(R), TC_1(R), \ldots, TC_k(R)$ and then eliminates the 
description operator terms, via application of the Select Axiom. Now the 
desired termination constraints have been derived. 

7. Derives the induction theorem from the termination conditions. 

8. Returns the recursion equations and the induction theorem. 

² The proof takes four tactic applications in a current version [14] of the Hol98 proof 
assistant. 

Fortunately, the algorithms of [30,31] generalize naturally to support steps 
1 to 8. The key to automation is step 3, in which termination conditions are 
automatically extracted. This is accomplished by use of a special contextual 
rewriter, which attempts to rewrite the instantiated recursion theorem (coming 
from step 2) with the conditional rewrite rule for function restriction (Theorem 5). 
In searching for matches for this rule, the rewriter is essentially searching for 
every recursive call site in the original equations. The rewriter uses its stock of 
contextual rules to gather and discard context $\Gamma$ as it makes its search; when a 
recursive call site $(f \mid R, pat_i)\,(a_{ij})$ is found (in context $\Gamma(a_{ij})$), the termination 
condition $\Gamma(a_{ij}) \supset R\ a_{ij}\ pat_i$ is captured by performing a small proof which 
stores the termination condition on the assumptions. After the rewriting process 
terminates, one is left with a theorem, the conclusion of which is the desired 
recursion equations, and the assumptions of which are the termination conditions 
(from which the induction theorem can be derived). An important point about 
these manipulations is that they all take place by deductive steps in the object 
logic, so the results are sound. Detailed descriptions of the algorithms, including 
their extension to mutual recursion, nested recursion, and higher order recursion 
can be found in [32]. 
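The search for recursive call sites can be mimicked syntactically. The sketch below is an assumption-laden simplification: in the paper the extraction happens by rewriting inside the logic, whereas here a plain traversal of a tuple-encoded expression collects, for each recursive call, the conditional context and the argument. The function `extract_termination_conditions` and the expression encoding are hypothetical.

```python
def extract_termination_conditions(fname, pattern, body):
    """Collect (context, argument, pattern) for every recursive call site.

    Each triple stands for the condition  context ==> R argument pattern.
    Expressions are nested tuples:
      ("if", c, t, e), ("call", f, arg), ("op", name, e1, ..., en), or a leaf.
    """
    found = []

    def walk(expr, context):
        if not isinstance(expr, tuple):
            return  # variables and constants contain no call sites
        tag = expr[0]
        if tag == "if":
            _, cond, then_branch, else_branch = expr
            walk(cond, context)
            # Contextual rule for conditionals: the test (or its negation)
            # is added to the context of the corresponding branch.
            walk(then_branch, context + [cond])
            walk(else_branch, context + [("not", cond)])
        elif tag == "call":
            _, callee, arg = expr
            walk(arg, context)  # look for nested calls inside the argument
            if callee == fname:
                found.append((tuple(context), arg, pattern))
        elif tag == "op":
            for sub in expr[2:]:
                walk(sub, context)

    walk(body, [])
    return found


# binRec x = if Atomic x then A x
#            else Join (binRec (Left x)) (binRec (Right x))
bin_rec_body = (
    "if", ("call", "Atomic", "x"),
    ("call", "A", "x"),
    ("op", "Join",
     ("call", "binRec", ("call", "Left", "x")),
     ("call", "binRec", ("call", "Right", "x"))),
)
tcs = extract_termination_conditions("binRec", "x", bin_rec_body)
```

For this input the two collected triples correspond to the conditions "not Atomic x implies R (Left x) x" and "not Atomic x implies R (Right x) x", matching the termination conditions of the binary recursion scheme.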

To extend these algorithms to program schemes is particularly simple: all 
that need be done is to take care never to quantify scheme variables in any of 
the derivations. If there are no scheme variables, the algorithms perform exactly 
as in [30,31], so schemes are a smooth extension to the existing apparatus. 

4 Formal Program Transformations 

Program schemes are helpful for giving suitably abstract descriptions of classes 
of functions. A further application of schemes comes from using them as a basis 
for program transformation: instances of schemes may be identified, provided 
applicability conditions are satisfied. There are various ways of representing pro- 
gram transformations formally; we choose to represent them simply as theorems 
(specifically, as constrained recursion equations). Proving a program transfor- 
mation typically involves an application of the induction theorem for one of the 
program schemes being equated. 

Example 9. The following scheme expresses a class of linear recursive programs: 

$$\mathrm{linRec}(x) = \mathbf{if}\ \mathit{Atomic}\ x\ \mathbf{then}\ A\,x\ \mathbf{else}\ \mathit{Join}\ (\mathrm{linRec}\ (\mathit{Dest}\ x))\ (D\,x).$$

Under certain conditions, instances of linRec are equal to corresponding instances 
of the following tail-recursive scheme, which uses an accumulating parameter: 

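$$
\begin{array}{l}
\mathrm{accRec}(x, u) = \mathbf{if}\ \mathit{Atomic}\ x\ \mathbf{then}\ \mathit{Join}\ (A\,x)\ u \\
\qquad \mathbf{else}\ \mathrm{accRec}\ (\mathit{Dest}\ x,\ \mathit{Join}\ (D\,x)\ u).
\end{array}
$$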

Intuitively, the recursive calls of linRec must get ‘stacked up’ somehow, wait- 
ing for deeper recursive calls to return. In contrast, calls to accRec need not be 
stacked. If the combination function Join is associative, then the implicit brack- 
eting of the stacked recursive calls can be replaced with a single data value that 
gets modified and passed at each recursive call. We now formalize this intuition. 
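The result of defining linRec is (we omit the induction theorem): 

$$
\begin{array}{l}
[\mathrm{WF}\ R,\ \forall x.\ \neg \mathit{Atomic}\ x \supset R\ (\mathit{Dest}\ x)\ x] \\
\vdash \mathrm{linRec}\ D\ \mathit{Dest}\ \mathit{Join}\ A\ \mathit{Atomic}\ x = \\
\qquad \mathbf{if}\ \mathit{Atomic}\ x\ \mathbf{then}\ A\,x \\
\qquad \mathbf{else}\ \mathit{Join}\ (\mathrm{linRec}\ D\ \mathit{Dest}\ \mathit{Join}\ A\ \mathit{Atomic}\ (\mathit{Dest}\ x))\ (D\,x),
\end{array}
$$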

and that for accRec is (we conjoin the induction theorem to the recursion equa- 
tion): 

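$$
\begin{array}{l}
[\mathrm{WF}\ R,\ \forall x.\ \neg \mathit{Atomic}\ x \supset R\ (\mathit{Dest}\ x)\ x] \\
\vdash (\mathrm{accRec}\ D\ \mathit{Dest}\ A\ \mathit{Join}\ \mathit{Atomic}\ (x, u) = \\
\qquad \mathbf{if}\ \mathit{Atomic}\ x\ \mathbf{then}\ \mathit{Join}\ (A\,x)\ u \\
\qquad \mathbf{else}\ \mathrm{accRec}\ D\ \mathit{Dest}\ A\ \mathit{Join}\ \mathit{Atomic}\ (\mathit{Dest}\ x,\ \mathit{Join}\ (D\,x)\ u)) \\
\quad \wedge \\
\quad \forall P.\ (\forall x\ u.\ (\neg \mathit{Atomic}\ x \supset P\,(\mathit{Dest}\ x,\ \mathit{Join}\ (D\,x)\ u)) \supset P\,(x, u)) \supset \forall v\ v_1.\ P\,(v, v_1).
\end{array}
$$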
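The formal program transformation is then captured in the following theorem: 

$$
\begin{array}{l}
[\mathrm{WF}\ R,\ \forall x.\ \neg \mathit{Atomic}\ x \supset R\ (\mathit{Dest}\ x)\ x, \\
\ \forall p\ q\ r.\ \mathit{Join}\ p\ (\mathit{Join}\ q\ r) = \mathit{Join}\ (\mathit{Join}\ p\ q)\ r] \\
\vdash \forall x\ u.\ \mathit{Join}\ (\mathrm{linRec}\ D\ \mathit{Dest}\ \mathit{Join}\ A\ \mathit{Atomic}\ x)\ u = \mathrm{accRec}\ D\ \mathit{Dest}\ A\ \mathit{Join}\ \mathit{Atomic}\ (x, u).
\end{array}
$$

Proof. Apply the induction theorem for accRec, then expand the definitions of 
linRec and accRec. □ 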
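The transformation can be tried out on a small instance. In the sketch below the two schemes are written as Python higher-order functions, with the parameter order following the theorem statement; the concrete instance (naturals, string concatenation as an associative but non-commutative Join) is our own choice, and the second instance with subtraction shows what goes wrong when Join is not associative.

```python
def linRec(D, Dest, Join, A, Atomic):
    def f(x):
        if Atomic(x):
            return A(x)
        return Join(f(Dest(x)), D(x))
    return f


def accRec(D, Dest, A, Join, Atomic):
    def g(x, u):
        # The tail-recursive equation, read as a loop; both components of the
        # right-hand side are computed from the old value of x.
        while not Atomic(x):
            x, u = Dest(x), Join(D(x), u)
        return Join(A(x), u)
    return g


# Instance over the naturals with an associative, non-commutative Join.
Atomic = lambda n: n == 0
A = lambda n: ""
Dest = lambda n: n - 1
D = lambda n: str(n)
Join = lambda p, q: p + q          # string concatenation

lin = linRec(D, Dest, Join, A, Atomic)
acc = accRec(D, Dest, A, Join, Atomic)

# Counter-instance: subtraction is not associative.
sub = lambda p, q: p - q
lin_sub = linRec(lambda n: n, Dest, sub, lambda n: 0, Atomic)
acc_sub = accRec(lambda n: n, Dest, lambda n: 0, sub, Atomic)
```

With concatenation, `Join(lin(n), u)` and `acc(n, u)` agree on every input tried; with subtraction they already differ at `n = 2`, which shows that the associativity hypothesis of the theorem is doing real work.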



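Example 10. The following scheme for binary recursion uses the parameters Left 
and Right to break the input into two parts on which to recurse: 

$$
\begin{array}{l}
\mathrm{binRec}(x) = \mathbf{if}\ \mathit{Atomic}\ x\ \mathbf{then}\ A\,x \\
\qquad \mathbf{else}\ \mathit{Join}\ (\mathrm{binRec}\ (\mathit{Left}\ x))\ (\mathrm{binRec}\ (\mathit{Right}\ x)).
\end{array}
$$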

The result of the definition is (omitting the induction theorem) 

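$$
\begin{array}{l}
[\mathrm{WF}\ R, \\
\ \forall x.\ \neg \mathit{Atomic}\ x \supset R\ (\mathit{Right}\ x)\ x, \\
\ \forall x.\ \neg \mathit{Atomic}\ x \supset R\ (\mathit{Left}\ x)\ x] \\
\vdash \mathrm{binRec}\ \mathit{Right}\ \mathit{Left}\ \mathit{Join}\ A\ \mathit{Atomic}\ x = \\
\qquad \mathbf{if}\ \mathit{Atomic}\ x\ \mathbf{then}\ A\,x \\
\qquad \mathbf{else}\ \mathit{Join}\ (\mathrm{binRec}\ \mathit{Right}\ \mathit{Left}\ \mathit{Join}\ A\ \mathit{Atomic}\ (\mathit{Left}\ x)) \\
\qquad \phantom{\mathbf{else}\ \mathit{Join}}\ (\mathrm{binRec}\ \mathit{Right}\ \mathit{Left}\ \mathit{Join}\ A\ \mathit{Atomic}\ (\mathit{Right}\ x)).
\end{array}
$$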

The example comes from Wand [33] , who used paper and pencil, and has been 
treated in PVS [29]. In his development, Wand was interested in explaining how 
continuations give the programmer a representation of the runtime stack, and 
thus can act as a bridge in the transformation of non-tail-recursive functions to 
tail recursive ones. In our development, we will avoid the continuation-passing 
intermediate representation (although it is simple for us to handle) and transform 
to tail recursion in one step. 

Now we present a general tail recursion scheme for lists. In the definition, the 
parameter $\mathit{Dest} : \alpha \to \alpha\ \mathrm{list}$ breaks the head h of the work list h :: t into a list of 
new work, which it prepends to t before continuing; hence, the tailRec scheme is 
quite general because the argument to the second tail call may increase in length 
by any finite amount. (Wand and Shankar only consider tail recursions in which 
the Dest parameter can produce two new pieces of work.) 

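$$
\begin{array}{l}
\mathrm{tailRec}\ ([\,], v) = v \\
\mathrm{tailRec}\ (h :: t, v) = \mathbf{if}\ \mathit{Atomic}\ h\ \mathbf{then}\ \mathrm{tailRec}\ (t,\ \mathit{Join}\ v\ (A\,h)) \\
\qquad \mathbf{else}\ \mathrm{tailRec}\ (\mathit{Dest}\ h\ @\ t,\ v).
\end{array}
$$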

The result of this definition is (including the induction theorem) 

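$$
\begin{array}{l}
[\mathrm{WF}\ R, \\
\ \forall v\ t\ h.\ \neg \mathit{Atomic}\ h \supset R\ (\mathit{Dest}\ h\ @\ t,\ v)\ (h :: t,\ v), \\
\ \forall v\ t\ h.\ \mathit{Atomic}\ h \supset R\ (t,\ \mathit{Join}\ v\ (A\,h))\ (h :: t,\ v)] \\
\vdash (\mathrm{tailRec}\ \mathit{Dest}\ A\ \mathit{Join}\ \mathit{Atomic}\ ([\,], v) = v)\ \wedge \\
\quad (\mathrm{tailRec}\ \mathit{Dest}\ A\ \mathit{Join}\ \mathit{Atomic}\ (h :: t,\ v) = \\
\qquad \mathbf{if}\ \mathit{Atomic}\ h \\
\qquad \mathbf{then}\ \mathrm{tailRec}\ \mathit{Dest}\ A\ \mathit{Join}\ \mathit{Atomic}\ (t,\ \mathit{Join}\ v\ (A\,h)) \\
\qquad \mathbf{else}\ \mathrm{tailRec}\ \mathit{Dest}\ A\ \mathit{Join}\ \mathit{Atomic}\ (\mathit{Dest}\ h\ @\ t,\ v)) \\
\quad \wedge \\
\quad (\forall P.\ (\forall v.\ P\,([\,], v))\ \wedge \\
\qquad (\forall h\ t\ v.\ (\neg \mathit{Atomic}\ h \supset P\,(\mathit{Dest}\ h\ @\ t,\ v))\ \wedge \\
\qquad \phantom{(\forall h\ t\ v.\ } (\mathit{Atomic}\ h \supset P\,(t,\ \mathit{Join}\ v\ (A\,h))) \supset P\,(h :: t,\ v)) \\
\qquad \supset \forall v\ v_1.\ P\,(v, v_1)).
\end{array}
$$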

We intend to prove an equivalence between binRec and tailRec but the transfor- 
mation seems to require the termination constraints for both binRec and tailRec 
to be satisfied. However, a bit of thought reveals that a useful fact about finite 
multisets can simplify matters, by allowing one constraint to be expressed in 
terms of the other. 

Definition 11 (msetPred). Let m be a finite multiset and $R : \alpha \to \alpha \to \mathrm{bool}$ 
a relation on elements of m. The relation msetPred R is built by removing an 
element x from m and replacing it with a finite multiset of elements, each of which is 
R-smaller than x. 

Theorem 12. $\mathrm{WF}(R) \supset \mathrm{WF}(\mathrm{msetPred}\ R)$

Proof. The classic(al) proof can be found in [8]; a recent constructive proof is 
described in [27, Chapter II]. □ 

Now we show how the termination condition of tailRec can be reduced to the 
(simpler) one of binRec: 

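$$
\begin{array}{l}
\mathrm{WF}\ R \wedge (\forall h\ y.\ \neg \mathit{Atomic}\ h \wedge \mathrm{mem}\ y\ (\mathit{Dest}\ h) \supset R\ y\ h) \\
\supset \\
\exists R'.\ \mathrm{WF}\ R'\ \wedge \\
\quad (\forall h\ t\ v.\ \neg \mathit{Atomic}\ h \supset R'\ (\mathit{Dest}\ h\ @\ t,\ v)\ (h :: t,\ v))\ \wedge \\
\quad (\forall h\ t\ v.\ \mathit{Atomic}\ h \supset R'\ (t,\ \mathit{Join}\ v\ (A\,h))\ (h :: t,\ v))
\end{array}
$$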

Proof. Assume WF R and Vh y. -^Atomic h A mem y {Dest h) Z) R y h). R' 
is a relation on pairs. The witness for R' operates over the first projection of 
the pair, i.e., over lists, and maps a list into a multiset of the list elements. 
Since R is wellfounded, msetPred over the multiset is wellfounded and thus the 
witness is wellfounded. The remaining two conjuncts are both true, the first by 
assumption and the definition of msetPred, and the second by the definition of 
msetPred, since no elements are being put back into the multiset. □ 

With this reduction, one can state and prove the following general theorem 
relating binary recursion and tail recursion. The essential insight is that the work 
list l of tailRec represents a linearization of the binary tree of calls of binRec. Thus
going from left to right through the work list, invoking binRec and accumulating
the results, should deliver the same answer as executing tailRec on the work list.
We formalize this left-to-right pass by the auxiliary function rev_itlist (defined in
a footnote below). Note how the Dest parameter of tailRec has been specialized with
λx. [Left x, Right x].

    [WF R,
     ∀x. ¬Atomic x ⊃ R (Left x) x ∧ R (Right x) x,
     ∀p q r. Join (Join p q) r = Join p (Join q r)]
    ⊢
    ∀l v₀.
      rev_itlist (λtr v. Join v (binRec Right Left Join A Atomic tr)) l v₀
      =
      tailRec (λx. [Left x, Right x]) A Join Atomic (l, v₀)

Proof. Induct with the induction theorem for tailRec. The base case is straight- 
forward; the step case is also essentially trivial, since it only involves using the 
induction hypotheses and rewriting with the definitions of rev_itlist, tailRec, and 
binRec. □ 

Finally, the desired program transformation

    [WF R,
     ∀x. ¬Atomic x ⊃ R (Left x) x ∧ R (Right x) x,
     ∀p q r. Join (Join p q) r = Join p (Join q r)]
    ⊢
    ∀x v₀.
      Join v₀ (binRec Right Left Join A Atomic x)
      =
      tailRec (λx. [Left x, Right x]) A Join Atomic ([x], v₀)

can be obtained by instantiating the work list l to comprise the initial item of
work [x], and then reducing the definition of rev_itlist away.
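
As an informal illustration of the transformation (not part of the formal development), the two schemes can be sketched in Python and compared on one instance. The definition of bin_rec below assumes the usual shape of binRec (apply A to atomic items, otherwise Join the results for Left x and Right x); the interval instance, and all of the names atomic, a, join, left, right, and dest, are hypothetical choices made only for this sketch.

```python
def bin_rec(right, left, join, a, atomic, x):
    """binRec: split x into Left x and Right x until Atomic, then Join the results."""
    if atomic(x):
        return a(x)
    return join(bin_rec(right, left, join, a, atomic, left(x)),
                bin_rec(right, left, join, a, atomic, right(x)))


def tail_rec(dest, a, join, atomic, work, v):
    """tailRec on (work, v); the two tail calls are rendered as a loop."""
    work = list(work)
    while work:
        h, t = work[0], work[1:]
        if atomic(h):
            v = join(v, a(h))      # tailRec (t, Join v (A h))
            work = t
        else:
            work = dest(h) + t     # tailRec (Dest h @ t, v)
    return v


def rev_itlist(f, items, v):
    """rev_itlist f [] v = v;  rev_itlist f (h :: t) v = rev_itlist f t (f h v)."""
    for h in items:
        v = f(h, v)
    return v


# Hypothetical instance: x is a non-empty half-open integer interval (lo, hi).
atomic = lambda x: x[1] - x[0] == 1
a = lambda x: [x[0]]
join = lambda p, q: p + q                        # list append is associative
left = lambda x: (x[0], (x[0] + x[1]) // 2)
right = lambda x: ((x[0] + x[1]) // 2, x[1])
dest = lambda x: [left(x), right(x)]

v0 = [-1]
lhs = join(v0, bin_rec(right, left, join, a, atomic, (0, 5)))
rhs = tail_rec(dest, a, join, atomic, [(0, 5)], v0)
```

Here the interval length plays the role of the wellfounded relation R, and list append is associative but not commutative, so the comparison of lhs and rhs also exercises the left-to-right order of the work list.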



Example 13. Now we derive a program transformation originally presented by 
Bird [4], and later mechanized by Shankar [29]. Consider a datatype btree of 
binary trees with constructors 

    LEAF : α btree
    NODE : α btree → α → α btree → α btree.

The so-called catamorphism (iterator) for this type is

    btreeRec LEAF v f = v
    btreeRec (NODE t₁ M t₂) v f = f (btreeRec t₁ v f) M (btreeRec t₂ v f).

Footnote: rev_itlist, also known as foldl to functional programmers, is defined as

    rev_itlist f [] v = v
    rev_itlist f (h :: t) v = rev_itlist f t (f h v).

Most mechanizations of higher order logic automate such definitions; however,
the so-called anamorphism (or unfold, or co-recursor) for this type has not been
straightforward to define in these systems. Understanding the following definition
of unfold : α → β btree may be eased by considering it as operating over an
abstract datatype α which supports operations More : α → bool and
Dest : α → α * β * α.

    unfold x = if More x
               then let (y₁, b, y₂) = Dest x
                    in NODE (unfold y₁) b (unfold y₂)
               else LEAF.

The automatically computed constraints attached to the definition are the fol- 
lowing: 

    [WF R,
     ∀x y₁ b y₂. More x ∧ ((y₁, b, y₂) = Dest x) ⊃ R y₂ x,
     ∀x y₁ b y₂. More x ∧ ((y₁, b, y₂) = Dest x) ⊃ R y₁ x]

Notice that the mechanization is not currently smart enough to know that the 
two termination conditions share the same context. After some trivial manipu- 
lation to join the two termination conditions, the induction theorem for unfold 
is the following (omitting the hypotheses): 

    ∀P. (∀x. (∀y₁ b y₂. More x ∧ ((y₁, b, y₂) = Dest x) ⊃ P y₁ ∧ P y₂) ⊃ P x) ⊃ ∀v. P v.    (8)

It is easy to generalize unfold to an arbitrary range type by replacing NODE and 
LEAF with parameters G and C: 

    fuse x = if More x
             then let (y₁, b, y₂) = Dest x
                  in G (fuse y₁) b (fuse y₂)
             else C.

The fusion theorem states that unfolding into a btree and then applying a struc- 
tural recursive function to the result is equivalent to interweaving unfolding steps 
with the steps taken in the structural recursion. Thus two recursive passes over 
the data can be replaced by one: 

    [WF R,
     ∀x y₁ b y₂. More x ∧ ((y₁, b, y₂) = Dest x) ⊃ R y₁ x ∧ R y₂ x]
    ⊢
    ∀x C G. btreeRec (unfold Dest More x) C G = fuse C Dest G More x.

Proof. The proof is by induction using (8), followed by expanding the definitions 
of btreeRec, unfold, and fuse. □ 
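
The fusion theorem can be checked informally on a small instance. The following Python sketch transcribes unfold, btreeRec, and fuse; the tuple encoding of trees and the interval-based more and dest functions are assumptions made only for this illustration, with the interval length standing in for the wellfounded relation R.

```python
LEAF = None   # a tree is LEAF or a triple (left subtree, label, right subtree)


def unfold(dest, more, x):
    """Anamorphism: build a tree from a seed x."""
    if more(x):
        y1, b, y2 = dest(x)
        return (unfold(dest, more, y1), b, unfold(dest, more, y2))
    return LEAF


def btree_rec(t, v, f):
    """Catamorphism: replace LEAF by v and NODE by f."""
    if t is LEAF:
        return v
    t1, m, t2 = t
    return f(btree_rec(t1, v, f), m, btree_rec(t2, v, f))


def fuse(c, dest, g, more, x):
    """unfold with NODE and LEAF replaced by the parameters g and c."""
    if more(x):
        y1, b, y2 = dest(x)
        return g(fuse(c, dest, g, more, y1), b, fuse(c, dest, g, more, y2))
    return c


# Hypothetical instance: the seed is a half-open interval (lo, hi); each step
# emits the midpoint as a label and recurses on the two strictly smaller halves.
more = lambda x: x[0] < x[1]


def dest(x):
    lo, hi = x
    mid = (lo + hi) // 2
    return (lo, mid), mid, (mid + 1, hi)


add3 = lambda l, b, r: l + b + r
two_pass = btree_rec(unfold(dest, more, (0, 10)), 0, add3)
one_pass = fuse(0, dest, add3, more, (0, 10))
```

Both two_pass and one_pass sum the labels 0 through 9; the second does so without building the intermediate tree.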
5 Related Work

The paper by Huet and Lang [18] is an important early milestone in the field 
of program transformation. They worked in the LCF system, using fixpoint in- 
duction to derive program transformations. Program schemes were not defined; 
instead, transformations were represented via applications of the Y combinator, 
i.e., had the form applicability conditions ⊃ Y F = Y G, for functionals F and
G. An influential aspect of the work was the use of second order matching to
automate the application of program transformations. 

Work using PVS has represented program schemes and transformations by 
theories parameterized over the parameters of the scheme and having as proof 
obligations the applicability conditions of the transformation [29,9,10]. To apply 
the program transformation, the theory must be instantiated, and the corre- 
sponding concrete proof obligations proved. 

In our technique — in contrast — the parameters of a scheme are arguments to 
the defined constant, and the proof obligations are constraints on the recursion 
equations and the induction theorem. Thus, theorems are used to represent both 
program schemas and program transformations. Instantiating a program trans- 
formation in our setting merely requires one to instantiate type variables and/or 
free term variables in a theorem. It remains to be seen if one representation 
is preferable to the other. In other ways, however, our approach seems to offer 
improved functionality: 

1. Currently, our technique produces more general schemes, since termination
conditions are phrased in terms of an arbitrary wellfounded relation, whereas
termination relations in PVS are restricted to measure functions [22]. Similarly,
a general induction theorem is automatically derived for each scheme
in our setting, whereas the PVS user is limited to measure induction (or
may alternatively derive a more general induction theorem ‘by hand’ from
wellfounded induction).

2. Our technique is more convenient because it automatically generates — by 
deductive steps — termination conditions for schemes. Taking the example of 
unfold, one doesn’t have to ponder the right constraints in our setting: they 
are delivered as part of the returned definition. In contrast, the definition of 
unfold in [29] requires expert knowledge of the PVS type system in order to 
phrase the right constraints on the Dest parameter. Since the termination 
conditions of a scheme constrain any program transformation that mentions 
the scheme, our approach should also ease the correct formulation of program 
transformations.

3. Our approach also works for mutually recursive schemes, which are not cur- 
rently available in PVS. 

The paper of Basin and Anderson [1] has much in common with our work: 
for example, both approaches represent schemes and transformations by HOL 
theorems (Basin and Anderson call these rules). Their work differs from ours 
by focusing on relations (they are interested in modelling logic programs) rather
than recursive functions. They present two techniques: in the first, program
schemes are not defined; instead, transformations are derived by wellfounded
induction on the arguments of the specified recursive relations (the relations
themselves are left as variables). In the second, a program scheme is represented
by an inductively defined relation. The first approach suffers from lack of au- 
tomation: termination constraints are not synthesized and induction theorems 
are not automatically derived. In contrast, their second approach requires no 
mention of wellfoundedness, and induction is automatically derived by the in- 
ductive definition package of Isabelle/HOL. 

In [11], Farmer treats the definition of recursive functions in a logic of par-
tial functions. Schematic functions are represented in a similar manner to our 
approach, but the automation issues we tackle have not been explored. 

In the context of language design, Lewis et al. [19] use schemes to implement
a degree of dynamic scoping in a statically scoped functional programming
language. Their approach allows occurrences of a free variable, e.g., V, in
the body of a program to be marked with special syntax, e.g., ?V. The program
is then treated as being parameterized by all such variables. To instantiate
V occurring in a program f by a ground value val, they employ a notation
‘f with ?V = val’. Although their work is phrased using operational semantics
and ours is denotationally based, there are many similarities.

Finally, our approach gives a higher-order and fully formal account of the 
steadfast transformation idea of Flener et al. [12]. In contrast to their work, 
we need give no soundness proof since our transformations are generated by 
deductive steps in a sound logic. 

6 Conclusions 

We have shown how a very simple technique allows a smooth treatment of pro- 
gram schemes, their induction theorems, and program transformations. Although 
the ideas are presented in the HOL logic, they are broadly applicable: the only 
notable requirements are a recursion theorem of the right form, a basic definition 
principle for introducing abbreviations, and an indefinite description operator. 
We have also sketched how higher levels of automation may be achieved, based 
on the automatic extraction of termination conditions by contextual rewriting. 
A few standard examples have been covered and, in some cases, generalized. 

We emphasize that transformations derived using our technique are sound. 
For any instantiation of the parameters of a scheme or transformation, the rules 
of deduction force the applicability constraints to be likewise instantiated, and 
those instantiations persist in the hypotheses until eliminated by deduction. An 
instantiated scheme or transformation with invalid constraints can of course be 
trivialized. 

Future work should focus on the difficult problems involved in automating the 
application of program transformations. One potential benefit of our practice of 
always deriving induction theorems may be that, if the scheme and the induction 
theorem are treated as a unit during instantiation, the instantiated induction
scheme will be available for reasoning about the instantiated program at each
step in the instantiation chain.

The schematic definition facility we have presented has been implemented 
via simple extensions to the TFL [32] package: as a result, program schemes as 
described in this paper have been available in the public releases of both the 
Hol98 and Isabelle/HOL systems since summer 1999.

Appendix

The HOL logic is a typed higher-order predicate calculus [13], derived from 
Church’s Simple Theory of Types [7]. The HOL logic is classical and has a 
set theoretic semantics, in which types denote non-empty sets and the function 
space denotes total functions. Several mature mechanizations exist [14,16,25]. 
The HOL logic is built on the syntax of a lambda calculus with an ML-style
polymorphic type system. The syntax is based on signatures for types (Ω) and
terms (Σ_Ω). The type signature assigns arities to type operators, while the term
signature delivers the types of constants.

Definition 14 (HOL Types). The set of types is the least set closed under the 
following rules: 

type variable. There is a countable set of type variables, which are represented
with Greek letters, e.g., α, β, etc.

compound type. If c in Ω has arity n, and each of ty₁, ..., tyₙ is a type, then
c(ty₁, ..., tyₙ) is a type.

A type constant is represented by a 0-ary compound type. A large collection
of types can be definitionally constructed in HOL, building on the initial types
found in Ω: truth values (bool), function space (written α → β), and an infinite
set of individuals (ind).

Terms are typed λ-calculus expressions built with respect to Σ_Ω. When we
wish to show that a term M has type τ, the notation M : τ is used.

Definition 15 (HOL Terms). The set of terms is the least set closed under
the following rules:

Variable. If v is a string and ty is a type built from Ω, then v : ty is a term.

Constant. (c : ty) is a term if c : τ is in Σ_Ω and ty is an instance of τ, i.e.,
there exists a substitution for type variables θ, such that each element of the
range of θ is a type in Ω and θ(τ) = ty.

Combination. (M N) is a term of type β if M is a term of type α → β and
N is a term of type α.

Abstraction. (λv. M) is a term of type α → β if v is a variable of type α and
M is a term of type β.
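
The four term-formation rules can be read as a small type-checking procedure. The Python sketch below is only an illustration of that reading: the class and function names are hypothetical, and it deliberately omits the signature checks (arities in the type signature, and the requirement that a constant's type be an instance of its declared type).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TyVar:
    name: str


@dataclass(frozen=True)
class TyApp:
    op: str
    args: tuple = ()


def fun(a, b):
    """The function space a -> b as a binary compound type."""
    return TyApp("fun", (a, b))


BOOL = TyApp("bool")


@dataclass(frozen=True)
class Var:
    name: str
    ty: object


@dataclass(frozen=True)
class Const:
    name: str
    ty: object


@dataclass(frozen=True)
class Comb:
    rator: object
    rand: object


@dataclass(frozen=True)
class Abs:
    bvar: Var
    body: object


def type_of(tm):
    """Return the type of a term, or raise TypeError if it is not well formed."""
    if isinstance(tm, (Var, Const)):
        return tm.ty
    if isinstance(tm, Comb):
        fty, aty = type_of(tm.rator), type_of(tm.rand)
        if isinstance(fty, TyApp) and fty.op == "fun" and fty.args[0] == aty:
            return fty.args[1]
        raise TypeError("ill-typed combination")
    if isinstance(tm, Abs):
        return fun(tm.bvar.ty, type_of(tm.body))
    raise TypeError("not a term")


alpha = TyVar("a")
x = Var("x", alpha)
identity = Abs(x, x)          # the term λx. x at type α → α
```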

Initially, Σ_Ω contains constants denoting equality (=), implication (⊃), and
an indefinite description operator (ε). Types and terms form the basis of the
prelogic, in which basic algorithmic manipulations on types and terms are defined:
e.g., the free variables of a type or term, α-convertibility, substitution, and
β-conversion. For describing substitution, the notation [M₁ ↦ M₂] N is used to
represent the term N where all free occurrences of M₁ have been replaced by
M₂. Of course, M₁ and M₂ must have the same type in this operation. During
substitution, every binding occurrence of a variable in N that would capture a
free variable in M₂ is renamed to avoid the capture taking place.

Deductive system. In Figure 1, a useful set of inference rules is outlined, along
with the axioms of the HOL logic. The derivable theorems are just those that
can be generated by using the axioms and inference rules of Figure 1. More
parsimonious presentations of this deductive system can be found in [13] or
Appendix A of [17].

A theorem with hypotheses P₁, ..., Pₖ and conclusion Q (all of type bool)
is written [P₁, ..., Pₖ] ⊢ Q. In the presentation of some rules, e.g., ∨-elim, the
following idiom is used: Γ, P ⊢ Q. This denotes a theorem where P occurs as
a hypothesis. A later reference to Γ then actually means Γ − {P}, i.e., had P
already been among the elements of Γ, it would now be removed.

Some rules, noted by use of the asterisk in Figure 1, have restrictions on their 
use or require special comment: 

- ∀-intro. The rule application fails if x occurs free in Γ.

- ∃-intro. The rule application fails if N does not occur free in P. Moreover,
  only some designated occurrences of N need be replaced by x. The details
  of how occurrences are designated vary from implementation to implementation.

- ∃-elim. The rule application fails if the variable v occurs free in Γ ∪ Δ ∪ {P, Q}.

- Abs. The rule application fails if v occurs free in Γ.

- tyinst. A substitution θ mapping type variables to types is applied to each
  hypothesis, and also to the conclusion.

    Rule       Premises                                   Conclusion
    ⊃-intro    Γ ⊢ Q                                      Γ − {P} ⊢ P ⊃ Q
    ⊃-elim     Γ ⊢ P ⊃ Q;  Δ ⊢ P                          Γ ∪ Δ ⊢ Q
    ∧-intro    Γ ⊢ P;  Δ ⊢ Q                              Γ ∪ Δ ⊢ P ∧ Q
    ∧-elim     Γ ⊢ P ∧ Q                                  Γ ⊢ P;  Γ ⊢ Q
    ∨-intro    Γ ⊢ P                                      Γ ⊢ P ∨ Q;  Γ ⊢ Q ∨ P
    ∨-elim     Γ₁ ⊢ P ∨ Q;  Γ₂, P ⊢ M;  Γ₃, Q ⊢ M         Γ₁ ∪ Γ₂ ∪ Γ₃ ⊢ M
    ∀-intro*   Γ ⊢ P                                      Γ ⊢ ∀x. P
    ∀-elim     Γ ⊢ ∀x. P                                  Γ ⊢ [x ↦ N]P
    ∃-intro*   Γ ⊢ P                                      Γ ⊢ ∃x. [N ↦ x]P
    ∃-elim*    Γ ⊢ ∃x. P;  Δ, [x ↦ v]P ⊢ Q                Γ ∪ Δ ⊢ Q
    Assume     (none)                                     P ⊢ P
    Refl       (none)                                     ⊢ M = M
    Sym        Γ ⊢ M = N                                  Γ ⊢ N = M
    Trans      Γ ⊢ M = N;  Δ ⊢ N = P                      Γ ∪ Δ ⊢ M = P
    Comb       Γ ⊢ M = N;  Δ ⊢ P = Q                      Γ ∪ Δ ⊢ M P = N Q
    Abs*       Γ ⊢ M = N                                  Γ ⊢ (λv. M) = (λv. N)
    tyinst*    Γ ⊢ M                                      θ(Γ) ⊢ θ(M)
    β-conv     (none)                                     ⊢ (λv. M) N = [v ↦ N]M

    Axioms
    Bool       ⊢ P ∨ ¬P
    Eta        ⊢ (λv. M v) = M
    Select     ⊢ P x ⊃ P (εx. P x)
    Infinity   ⊢ ∃f : ind → ind. (∀x y. (f x = f y) ⊃ (x = y)) ∧ ∃y. ∀x. ¬(y = f x)

Fig. 1. HOL deductive system

An important feature of the HOL logic is ε : (α → bool) → α, Hilbert’s
indefinite description operator. A description term εx : τ. P x is interpreted as
follows: it delivers an arbitrary element e of type τ such that P e holds. If there
is no object that P holds of, then εx : τ. P x denotes an arbitrary element of τ.
This is summarized in the axiom ⊢ ∀P x. P x ⊃ P (εx. P x).

Definition 16 (Arb). Arb = εz : α. T

The definition of Arb uses the Hilbert choice operator to denote an arbitrary
but fixed value, for each type τ. Arb is fixed because T has no free variables; it
is arbitrary because λv. T holds for every element of τ. (F and T are the two
constants of type bool denoting truth values in HOL.)

One of the most influential methodological developments in verification has
been the adoption of principles of definition as logical prophylaxis, and
implementations of HOL therefore tend to eschew the assertion of axioms.

Definition 17 (Principle of Constant Definition). Given terms x : τ and
M : τ in signature Σ_Ω, check that

1. x is a variable and the name of x is not the name of a constant in Σ_Ω;

2. τ is a type in Ω;

3. M is a term in Σ_Ω with no free variables; and

4. Every type variable occurring in M occurs in τ.

If all these checks are passed, add a constant x : τ to Σ_Ω and introduce an axiom
⊢ x = M. □

Thus invocation of the Principle of definition, for suitable c and M, introduces
c as an abbreviation for M. It is shown in [13] to be a sound means of extending
the HOL logic. The notation c = M is often used to show that a definition
is being made. Derived definition principles, such as the one presented in this
paper, reduce via deduction to application of the primitive Principle.

Abstract. We use the uniform framework of abstract congruence closure
to study the congruence closure algorithms described by Nelson and Op- 
pen [9], Downey, Sethi and Tarjan [7] and Shostak [11]. The descriptions 
thus obtained abstract from certain implementation details while still 
allowing for comparison between these different algorithms. Experimen- 
tal results are presented to illustrate the relative efficiency and explain 
differences in performance of these three algorithms. The transition rules 
for computation of abstract congruence closure are obtained from rules 
for standard completion enhanced with an extension rule that enlarges a
given signature by new constants. 



1 Introduction 

Algorithms to compute “congruence closure” have typically been described in 
terms of directed acyclic graphs (dags) representing a set of terms, and a union- 
find data structure storing an equivalence relation on the vertices of this graph. 
In this paper, we abstractly describe some of these algorithms while still main- 
taining the “sharing” and “efficiency” offered by the data structures. This is 
achieved through the concept of an abstract congruence closure, cf. [2, 3].

A key idea of abstract congruence closure is the use of new constants as 
names for subterms which yields a concise and simplified term representation. 
Consequently, complicated term orderings are no longer necessary or even appli- 
cable. There usually is a trade-off between the simplicity of terms thus obtained 
and the loss of term structure. In this paper, we get a middle ground where 
we keep the term structure as much as possible while still using extensions to 
obtain a simplified term representation. The paper also illustrates the use of an 
extended signature as a formalism to model and subsequently reason about data 
structures like the term dags, which are based on the idea of structure sharing. 

In Section 2 we review the description of abstract congruence closure as a
set of transition rules [2, 3]. The transition rules are derived from standard
completion [1] enhanced with extension and suitably modified for the ground
case. Taking such an abstract view allows for a better understanding of the
various graph-based congruence closure algorithms (Section 3), and also suggests
new efficient procedures for constructing congruence closures (Section 4).



Preliminaries 

Given a set Σ = ∪ₙ Σₙ of function symbols and constants, called a signature, the
set of (ground) terms T(Σ) over Σ is the smallest set containing Σ₀ and such
that f(t₁, ..., tₙ) ∈ T(Σ) whenever f ∈ Σₙ and t₁, ..., tₙ ∈ T(Σ). The index n of the
set Σₙ to which a function symbol f belongs is called the arity of the symbol
f. Elements of arity 0 are called constants. A symbol f ∈ Σₖ of arity k is also
said to be a k-ary function symbol. The symbols s, t, u, ... are used to denote
terms in T(Σ); f, g, ..., function symbols. We write t[s] to indicate that a term t
contains s as a subterm and (ambiguously) denote by t[u] the result of replacing
a particular occurrence of s by u. A subterm of a term t is called proper if it is
distinct from t.

An equation is a pair of terms, written as s ≈ t. The replacement or single-step
rewrite relation¹ →_E induced by a set of ground (or variable-free) equations
E is defined by: u[l] →_E u[r] if, and only if, l ≈ r is in E. If → is a binary
relation, then ← denotes its inverse, ↔ its symmetric closure, →⁺ its transitive
closure and →* its reflexive-transitive closure. Thus, ↔*_E denotes the congruence
relation², which is the same as the equational theory when E is ground, induced
by a set E of ground equations. Equations are often called rewrite rules, and a
set E a rewrite system, if one is interested particularly in the rewrite relation
→*_E rather than the equational theory ↔*_E.

If E is a set of equations, we write E[s] to denote that the term s occurs as
a subterm of some equation in E, and (ambiguously) use E[t] to denote the set
of equations obtained by replacing an occurrence of s in E by t.

A term t is in normal form with respect to a rewrite system R if there is
no term t' such that t →_R t'. We write s →!_R t to indicate that t is an R-normal
form of s. A rewrite system R is said to be (ground) confluent if every
(ground) term t has at most one normal form, i.e., if there exist s, s' such that
s ←*_R t →*_R s', then s →*_R ∘ ←*_R s'. It is terminating if there exists no infinite
sequence s₀ →_R s₁ →_R s₂ ··· of terms. Rewrite systems that are (ground)
confluent and terminating are called (ground) convergent.



2 Abstract Congruence Closure 

We first review the concept of an abstract congruence closure [2, 3]. Let Σ be a
signature and K be a set of constants disjoint from Σ. A D-rule³ (with respect
to Σ and K) is a rewrite rule of the form t → c where t is a term from the set
T(Σ ∪ K) − K and c is a constant in K. A C-rule (with respect to K) is a rule
c → d, where c and d are constants in K. For example, if Σ₀ = {a, b, f}, and
E₀ = {a ≈ b, ffa ≈ fb} then D₀ = {a → c₀, b → c₁, ffa → c₂, fb → c₃} is
a set of D-rules over Σ₀ and K₀ = {c₀, c₁, c₂, c₃}. Original equations in E₀ can
now be simplified using D₀ to give C₀ = {c₀ ≈ c₁, c₂ ≈ c₃}. The set D₀ ∪ C₀ may
be viewed as an alternative representation of E₀ over an extended signature. The
equational theory presented by D₀ ∪ C₀ is a conservative extension of the theory
E₀. This reformulation of the equations E₀ in terms of an extended signature is
(implicitly) present in all congruence closure algorithms, see Section 3.

¹ There is no difference between the replacement relation and the rewrite relation in
the ground case.

² A congruence relation is a reflexive, symmetric and transitive relation on terms that
is also a replacement relation.

A constant c in K is said to represent a term t in T(Σ ∪ K) (via the rewrite
system R) if t ↔*_R c. A term t is represented by R if it is represented by some
constant in K via R. For example, the constant c₂ represents the term ffa via
D₀.

Definition 1. Let Σ be a signature and K be a set of constants disjoint from Σ.
A ground rewrite system R = D ∪ C of D-rules and C-rules over the signature
Σ ∪ K is said to be an (abstract) congruence closure (with respect to Σ and K)
if (i) each constant c ∈ K that is in normal form with respect to R represents
some term t ∈ T(Σ) via R, and (ii) R is ground convergent.

If E is a set of ground equations over T(Σ ∪ K) and in addition R is such
that (iii) for all terms s and t in T(Σ), s ↔*_E t if, and only if, s →*_R ∘ ←*_R t,
then R will be called an (abstract) congruence closure for E.

Condition (i) essentially states that no superfluous constants are introduced;
condition (ii) ensures that equivalent terms have the same representative; and
condition (iii) implies that R is a conservative extension of the equational theory
induced by E over T(Σ).

The rewrite system R₀ = D₀ ∪ {c₀ → c₁, c₂ → c₃} above is not a congruence
closure for E₀, as it is not ground convergent. But we can transform R₀ into a
suitable rewrite system, using a completion-like process described in more detail
below, to obtain a congruence closure

    R₁ = {a → c₁, b → c₁, fc₁ → c₃, fc₃ → c₃, c₀ → c₁, c₂ → c₃}.
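
To make the role of R₁ concrete, the following Python sketch normalizes ground terms with these six rules. The term encoding (strings for constants, tuples for applications of f) and the function name normalize are assumptions made for illustration only.

```python
# Constants are strings; an application f(t) is the tuple ("f", t).
R1 = {
    "a": "c1",
    "b": "c1",
    ("f", "c1"): "c3",
    ("f", "c3"): "c3",
    "c0": "c1",
    "c2": "c3",
}


def normalize(term, rules):
    """Innermost normal form of a ground term under a terminating rule set."""
    if isinstance(term, tuple):
        term = (term[0],) + tuple(normalize(s, rules) for s in term[1:])
    while term in rules:          # every right-hand side is a constant
        term = rules[term]
    return term


ffa = ("f", ("f", "a"))
fb = ("f", "b")
```

Since R₁ is ground convergent, two terms over the original signature are equal in the theory of E₀ exactly when normalize returns the same constant for both; for instance ffa and fb share a normal form, while a and fa do not.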

Construction of Congruence Closures 

We next present a general method for construction of congruence closures. Our
description is fairly abstract, in terms of transition rules that manipulate triples
(K, E, R), where K is the set of constants that extend the original fixed signature
Σ, E is the set of ground equations (over Σ ∪ K) yet to be processed, and R is
the set of C-rules and D-rules that have been derived so far. Triples represent
states in the process of constructing a congruence closure. Construction starts
from initial state (∅, E, ∅), where E is a given set of ground equations.

³ The definition of a D-rule is more general than the definition presented in [2, 3] as
it allows for arbitrary non-constant terms on the left-hand side.

The transition rules can be derived from those for standard completion as
described in [1], with some differences. In particular, (i) application of the
transition rules is guaranteed to terminate, and (ii) a convergent system is constructed
over an extended signature. The transition rules do not require any reduction
ordering⁴ on terms in T(Σ), but only a simple ordering ≻ on terms in
T(Σ ∪ U)⁵ where U is an infinite set of constants from which new constants
K ⊂ U are chosen. In particular, if we assume ≻_U is any ordering on the set U,
then ≻ is defined as: c ≻ d if c ≻_U d and t ≻ c if t → c is a D-rule. In this paper,
the set U = {c₀, c₁, c₂, ...}, and we will assume cᵢ ≻_U cⱼ iff i < j.

A key transition rule introduces new constants as names for subterms.

    Extension:            (K, E[t], R)
                  ----------------------------
                  (K ∪ {c}, E[c], R ∪ {t → c})

where t → c is a D-rule, t is a term occurring in (some equation in) E, and
c ∉ Σ ∪ K.

The following three rules are identical to the corresponding rules for standard
completion.

    Simplification:   (K, E[t], R ∪ {t → c})
                      ----------------------
                      (K, E[c], R ∪ {t → c})

where t occurs in some equation in E.

It is fairly easy to see that by repeated application of extension and
simplification, any equation in E can be reduced to an equation that can be oriented
by the ordering ≻.

    Orientation:   (K ∪ {c}, E ∪ {t ≈ c}, R)
                   -------------------------
                   (K ∪ {c}, E, R ∪ {t → c})

if t ≻ c.

Trivial equations may be deleted.

    Deletion:   (K, E ∪ {t ≈ t}, R)
                -------------------
                     (K, E, R)



In the case of completion of ground equations, deduction steps can all be
replaced by suitable simplification steps. In particular, most of the deduction
steps can be described by collapse, and hence, the deduction rule considers only
simple forms of overlap.

    Deduction:     (K, E, R ∪ {t → c, t → d})
                 ------------------------------
                 (K, E ∪ {c ≈ d}, R ∪ {t → d})

In our case the usual side condition in the collapse rule, which refers to the
encompassment ordering, can easily be stated in terms of the subterm relation.

    Collapse:   (K, E, R ∪ {s[t] → d, t → c})
                -----------------------------
                (K, E, R ∪ {s[c] → d, t → c})

if t is a proper subterm of s.

⁴ An ordering is any irreflexive and transitive relation on terms. A reduction ordering
is an ordering that is also a well-founded replacement relation.

⁵ Terms in T(Σ) are uncomparable by ≻.

As in standard completion the simplification of right-hand sides of rules in
R by other rules is optional and not necessary for correctness. The right-hand
side term in any rule in R is always a constant.

    Composition:   (K, E, R ∪ {t → c, c → d})
                   --------------------------
                   (K, E, R ∪ {t → d, c → d})

We use the symbol ⊢ to denote the one-step transition relation on states
induced by the above transition rules. A derivation is a sequence of states
(K₀, E₀, R₀) ⊢ (K₁, E₁, R₁) ⊢ ···.



Example 1. Consider the set of equations E₀ = {a ≈ b, ffa ≈ fb}. An abstract
congruence closure for E₀ can be derived from (K₀, E₀, R₀) = (∅, E₀, ∅)
as follows:

    i   Constants Kᵢ   Equations Eᵢ           Rules Rᵢ             Transition Rule
    0   ∅              E₀                     ∅
    1   {c₀}           {c₀ ≈ b, ffa ≈ fb}     {a → c₀}             Ext
    2   {c₀}           {ffa ≈ fb}             {a → c₀, b → c₀}     Ori
    3   {c₀}           {ffc₀ ≈ fc₀}           {a → c₀, b → c₀}     Sim²
    4   {c₀, c₁}       {fc₁ ≈ fc₀}            R₃ ∪ {fc₀ → c₁}      Ext
    5   {c₀, c₁}       {fc₁ ≈ c₁}             R₃ ∪ {fc₀ → c₁}      Sim
    6   K₅             {}                     R₅ ∪ {fc₁ → c₁}      Ori

The rewrite system R₆ is the required congruence closure.
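
The transition rules can also be read as a procedure. The following Python sketch fixes one particular strategy, assumed here purely for illustration: every subterm is named by a fresh constant (so only flat D-rules are created), constants are oriented by creation order so that cᵢ rewrites to cⱼ when i < j, and collapse, composition, and deduction are applied eagerly after each new C-rule. Because of this strategy it does not retrace the derivation in the table above; on E₀ it instead arrives at the rule set R₁ given earlier in this section. The function and variable names are hypothetical.

```python
def congruence_closure(equations):
    """Return (K, R): the new constants and a dict mapping rule left-hand sides
    to constants. Constants are strings; f(t1, ..., tn) is ("f", t1, ..., tn)."""
    K, R = [], {}
    E = list(equations)

    def find(c):
        # Normalize a constant of K with the C-rules currently in R.
        while c in R:
            c = R[c]
        return c

    def name(t):
        # Simplification and Extension: the constant of K that represents t.
        if t in K:
            return find(t)
        if isinstance(t, tuple):
            t = (t[0],) + tuple(name(s) for s in t[1:])
        if t not in R:
            c = "c%d" % len(K)            # Extension: fresh constant, D-rule t -> c
            K.append(c)
            R[t] = c
        return find(R[t])

    while E:
        s, t = E.pop(0)
        c, d = name(s), name(t)
        if c == d:
            continue                      # Deletion
        if K.index(c) > K.index(d):       # Orientation: c_i > c_j iff i < j
            c, d = d, c
        R[c] = d                          # new C-rule c -> d
        rebuilt = {}
        for lhs, rhs in R.items():
            if lhs in K:
                rebuilt[lhs] = find(lhs)  # Composition on C-rules
                continue
            if isinstance(lhs, tuple):    # Collapse: rewrite constants inside lhs
                lhs = (lhs[0],) + tuple(find(x) for x in lhs[1:])
            rhs = find(rhs)               # Composition on D-rules
            if lhs in rebuilt and rebuilt[lhs] != rhs:
                E.append((rebuilt[lhs], rhs))   # Deduction: same lhs, two constants
            else:
                rebuilt[lhs] = rhs
        R.clear()
        R.update(rebuilt)
    return K, R


E0 = [("a", "b"), (("f", ("f", "a")), ("f", "b"))]
K, R = congruence_closure(E0)
```

The sketch assumes that symbols of the original signature are not named c0, c1, and so on, since those names are reserved for the new constants.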

The correctness of the transition rules presented here can be established in 
a way similar to the correctness of the transition rules for computing a congru- 
ence closure modulo associativity and commutativity [3]. The differences arise 
from the more general definition of D-rules, and the lack of any associative and 
commutative functions here. 

The set of transition rules presented above is sound in the following sense: if
(K₀, E₀, R₀) ⊢ (K₁, E₁, R₁), then, for all terms s and t in T(Σ ∪ K₀), s ↔*_{E₁∪R₁} t
if and only if s ↔*_{E₀∪R₀} t. Additionally, let K₀ be a finite set of constants
(disjoint from Σ), E₀ be a finite set of equations (over Σ ∪ K₀), and R₀ be
a finite set of D-rules and C-rules such that for every C-rule c → d ∈ R₀,
we have c ≻_U d. Then, any derivation starting from (K₀, E₀, R₀) is finite. If
(K₀, E₀, R₀) ⊢* (Kₘ, Eₘ, Rₘ), then Rₘ is terminating. We call a state (K, E, R)
final if no transition rule (except possibly composition) is applicable.

Theorem 1. Let Σ be a signature and K₁ a finite set of constants disjoint from
Σ. Let E₁ be a finite set of equations over Σ ∪ K₁ and R₁ a finite set of D-rules
and C-rules such that every c ∈ K₁ represents some term t ∈ T(Σ) via
E₁ ∪ R₁, and c ≻_U d for every C-rule c → d in R₁. If (Kₙ, Eₙ, Rₙ) is a final
state such that (K₁, E₁, R₁) ⊢* (Kₙ, Eₙ, Rₙ), then Eₙ = ∅ and Rₙ is an abstract
congruence closure for E₁ ∪ R₁ (over Σ and K₁).

3 Congruence Closure Strategies 

The literature abounds with various implementations of congruence closure algo- 
rithms. We next describe the algorithms in [7], [9] and [11] as specific variants of 
our general abstract description. That is, we provide a description of these algo- 
rithms (modulo some implementation details) using abstract congruence closure 
transition rules. 

Term directed acyclic graphs (dags) is a common data structure used to 
implement algorithms that work with terms over some signature — such as the 
congruence closure algorithm. In fact, many algorithms that have been described 
for congruence closure assume that the input is an equivalence relation on ver- 
tices of a given dag, and the desired output is an equivalence on the same dag 
that is defined by the congruence relation. 

Figure 1 illustrates how a given term dag is (abstractly) represented using
D-rules. The solid lines represent subterm edges, and the dashed lines represent
a binary relation on the vertices. We have a D-rule corresponding to each vertex,
and a C-rule for each dashed edge. Note that the D-rules corresponding to a
conventional term dag representation are all of a special form
$f(c_1, \ldots, c_k) \to c$, where $f \in \Sigma$ is a $k$-ary function symbol,
and $c_1, \ldots, c_k, c$ are all new constants. Such rules will be called
simple D-rules. The definition of D-rules given in Section 2 is more general,
and allows for arbitrary terms on the left-hand sides. In a sense this
corresponds to storing contexts, rather than just symbols from $\Sigma$, in each
node (of the term dag). This is an attempt to keep as much of the term structure
information as possible and still get the advantages offered by a simplified
term representation via extensions.

We need to specify a set $U$ and an ordering on this set. Since elements of
$U$ serve only as names, we can choose $U$ to be any countable set of symbols.
An ordering $\succ_U$ need not be specified a priori but can be defined on the
fly as the derivation proceeds. (The ordering has to be extended so that the
irreflexivity and transitivity properties are preserved.)

Traditional congruence closure algorithms also employ other data structures 
such as the following: 

(i) Input dag: Starting from the state $(\emptyset, E_0, \emptyset)$, if we
apply extension and simplification using the strategy
$(\mathrm{Ext} \circ \mathrm{Sim}^*)^*$, making sure we create only simple
D-rules, we finally get to a state $(K_1, E_1, D_1)$ where all equations in
$E_1$ are of the form $c \approx d$, for $c, d \in K_1$. The set $D_1$, then,
represents the input dag and $E_1$ represents the (input) equivalence on the
vertices of this dag. Note that due to eager simplification, we obtain a
representation of a dag with maximum possible sharing. For example, if
$E_0 = \{a \approx b, ffa \approx fb\}$, then
$K_1 = \{c_0, c_1, c_2, c_3, c_4\}$,
$E_1 = \{c_0 \approx c_1, c_3 \approx c_4\}$ and
$D_1 = \{a \to c_0, b \to c_1, fc_0 \to c_2, fc_2 \to c_3, fc_1 \to c_4\}$.

[Figure 1, two panels: D-rules representing the term dag; C-rules representing
the relation on vertices.]

Fig. 1. A term dag and a relation on its vertices

(ii) Signature table: The signature table (indexed by vertices of the input dag)
stores a signature for some or all vertices. Clearly, the signatures are fully
left-reduced D-rules.

(iii) Use table: The use table (also called predecessor list) is a mapping from
the constant $c$ to the set of all vertices whose signature contains $c$. This
translates, in our presentation, to a method of indexing the set of D-rules.

(iv) Union Find: The union-find data structure that maintains equivalence
classes on the set of vertices is represented by the set of C-rules. If we apply
orientation and simplification to the state $(K_1, E_1, D_1)$ described above,
using the strategy $(\mathrm{Ori} \circ \mathrm{Sim}^*)^*$, we obtain a state
$(K_1, \emptyset, D_1 \cup C_1)$. The set $C_1$ is a representation of the
Union-Find structure capturing the input equivalence on vertices. Continuing
with the same example, $C_1$ would be the set
$\{c_0 \to c_1, c_3 \to c_4\}$.

We note that D-rules serve a two-fold purpose: they represent the input term
dag, and also a signature table. We also note that Composition is used only
implicitly in the various algorithms via path-compression on the union-find
structure.
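The construction in (i) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: terms are encoded as nested tuples, the helper names `flatten` and `name` are hypothetical, and a dictionary lookup stands in for eager simplification. It reproduces the sets $K_1$, $E_1$ and $D_1$ of the running example.

```python
# Terms are nested tuples: ("f", ("f", ("a",))) stands for f(f(a)),
# and ("a",) stands for the constant a. (Encoding assumed for this sketch.)

def flatten(equations):
    """Name every subterm by a fresh constant, sharing equal subterms.

    Returns (K, E, D): the list of new constants, the equations between
    constants, and the simple D-rules as a dict that maps a left-hand side
    (symbol, c_1, ..., c_k) to its constant c.
    """
    d_rules = {}      # left-hand side -> constant; one entry per dag vertex
    constants = []

    def name(term):
        symbol, *args = term
        lhs = (symbol, *[name(arg) for arg in args])
        if lhs not in d_rules:
            # Extension: introduce a fresh constant for an unnamed term.
            fresh = f"c{len(constants)}"
            constants.append(fresh)
            d_rules[lhs] = fresh
        # Simplification: an already named term is replaced by its constant.
        return d_rules[lhs]

    e = [(name(s), name(t)) for s, t in equations]
    return constants, e, d_rules


a, b = ("a",), ("b",)
K1, E1, D1 = flatten([(a, b), (("f", ("f", a)), ("f", b))])
```

Because every left-hand side is looked up before a new constant is created, two syntactically equal subterms always receive the same name, which is the maximal sharing mentioned above.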



Shostak’s Method 

Shostak’s congruence closure procedure was first described using simple D-rules 
and C-rules by Kapur [8] . We show here that Shostak’s congruence closure proce- 
dure is a specific strategy over the general transition rules for abstract congruence 
closure presented here. 

Shostak’s congruence closure is dynamic: it can accept new equations after 
it has processed some equations, and can incrementally take care of the new 



The signature of a term $f(t_1, \ldots, t_k)$ is defined as
$f(c_1, \ldots, c_k)$, where $c_i$ is the name of the equivalence class
containing term $t_i$.

equation. Its input state is $(\emptyset, E_0, \emptyset)$. Shostak’s procedure
can be described (at a fairly abstract level) as:

$$\mathrm{Shos} = ((\mathrm{Sim}^* \circ \mathrm{Ext}^*)^* \circ (\mathrm{Del} \cup \mathrm{Ori}) \circ (\mathrm{Col} \circ \mathrm{Ded}^*)^*)^*$$

which is implemented as follows: (i) pick an equation $s \approx t$ from the
E-component; (ii) use simplification to normalize the term $s$ to a term $s'$;
(iii) use extension to create simple D-rules for subterms of $s'$ until $s'$
reduces to a constant, say $c$, whence extension is no longer applicable.
Perform steps (ii) and (iii) on the other term $t$ as well to get a constant
$d$; (iv) if $c$ and $d$ are identical then apply deletion (and continue with
(i)), and if not, create a C-rule using orientation; (v) once we have a new
C-rule, perform all possible collapse steps by this new rule, where each
collapse step is followed by all the resulting deduction steps arising out of
that collapse. The whole process is now repeated starting from step (i).

Shostak’s procedure uses indexing based on the idea of the use() list. This
use()-based indexing is used to identify all possible collapse applications.

If the E-component of the state is empty while attempting to apply step (i),
Shostak’s procedure halts. It is fairly easy to observe that Shostak’s procedure
halts in a final state. Hence, Theorem 1 establishes that the R-component of
Shostak’s halting state contains a convergent system and is an abstract
congruence closure.
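Steps (i) to (v) can be sketched as a small incremental procedure. The Python below is an illustration only, not the implementation evaluated in this paper: the class and method names are hypothetical, terms are nested tuples such as `("f", ("a",))`, C-rules are always oriented left to right, and a plain scan over the D-rules replaces the use() lists.

```python
class CongruenceClosure:
    """Incremental congruence closure over simple D-rules and C-rules."""

    def __init__(self):
        self.d_rules = {}   # (symbol, c_1, ..., c_k) -> c
        self.c_rules = {}   # c -> d, standing for the C-rule c -> d
        self.count = 0      # number of constants introduced so far

    def rep(self, c):
        # Simplification by C-rules: follow c -> d until irreducible.
        while c in self.c_rules:
            c = self.c_rules[c]
        return c

    def name(self, term):
        # Steps (ii) and (iii): normalize a term to a constant,
        # extending with fresh constants where no D-rule applies.
        symbol, *args = term
        lhs = (symbol, *[self.name(arg) for arg in args])
        if lhs not in self.d_rules:
            self.d_rules[lhs] = f"c{self.count}"
            self.count += 1
        return self.rep(self.d_rules[lhs])

    def add_equation(self, s, t):
        pending = [(self.name(s), self.name(t))]          # step (i)
        while pending:
            c, d = pending.pop()
            c, d = self.rep(c), self.rep(d)
            if c == d:
                continue                                   # deletion
            self.c_rules[c] = d                            # orientation c -> d
            # Step (v): collapse every D-rule mentioning c, then deduce.
            for lhs in [l for l in self.d_rules if c in l[1:]]:
                rhs = self.d_rules.pop(lhs)
                new_lhs = (lhs[0], *[self.rep(x) for x in lhs[1:]])
                if new_lhs in self.d_rules:
                    # Two D-rules with the same left-hand side: deduction.
                    pending.append((rhs, self.d_rules[new_lhs]))
                else:
                    self.d_rules[new_lhs] = rhs

    def equal(self, s, t):
        # Note: naming unseen terms may add D-rules as a side effect.
        return self.name(s) == self.name(t)


a, b = ("a",), ("b",)
cc = CongruenceClosure()
cc.add_equation(a, b)
cc.add_equation(("f", ("f", a)), ("f", b))
```

On the equations $a \approx b$ and $ffa \approx fb$ the sketch ends with the C-rules $c_0 \to c_1$ and $c_3 \to c_2$ and the D-rules $a \to c_0$, $b \to c_1$, $fc_1 \to c_2$, $fc_2 \to c_3$, so `cc.equal(("f", a), ("f", ("f", b)))` holds.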

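Example 2. We use the set $E_0$ used in Example 1 of Section 2 to illustrate
Shostak’s method. We show some of the important intermediate steps of a
Shostak derivation.

i | Constants $K_i$ | Equations $E_i$ | Rules $R_i$ | Transition
--+-----------------+-----------------+-------------+-----------
0 | $\emptyset$ | $E_0$ | $\emptyset$ |
1 | $\{c_0, c_1\}$ | $\{ffa \approx fb\}$ | $\{a \to c_0, b \to c_1, c_0 \to c_1\}$ |
2 | $\{c_0, c_1\}$ | $\{ffc_1 \approx fb\}$ | $R_1$ | Sim
3 | $\{c_0, \ldots, c_3\}$ | $\{c_3 \approx fb\}$ | $R_2 \cup \{fc_1 \to c_2, fc_2 \to c_3\}$ |
4 | $\{c_0, \ldots, c_3\}$ | $\{c_3 \approx c_2\}$ | $R_3$ |
5 | $\{c_0, \ldots, c_3\}$ | $\emptyset$ | $R_4 \cup \{c_3 \to c_2\}$ | Ori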



The Downey Sethi Tarjan Algorithm 

The Downey, Sethi and Tarjan [7] procedure assumes that the input is a dag
and an equivalence relation on its vertices, which, in our language, means that
the starting state for this procedure is $(K_1, \emptyset, D_1 \cup C_1)$, where
$D_1$ represents the input dag and $C_1$ represents the initial equivalence. It
can be succinctly abstracted as:

$$\mathrm{DST} = ((\mathrm{Col} \circ (\mathrm{Ded} \cup \{\epsilon\}))^* \circ (\mathrm{Sim}^* \circ (\mathrm{Del} \cup \mathrm{Ori}))^*)^*$$

where $\epsilon$ is the null transition rule. This strategy is implemented as
follows: (i) if any collapse rule is applicable, it is applied and if, as a
result, any new deduction step is possible, it is done. This is repeated until
no more collapse steps are possible; (ii) if no collapse steps are possible,
then each C-equation in the E-component is picked up sequentially, fully
simplified (simplification) and then either deleted (deletion) or oriented
(orientation).

Although the above description captures the essence of the Downey, Sethi and
Tarjan procedure, a few implementation details need to be pointed out. Firstly,
the Downey, Sethi and Tarjan procedure keeps the original dag (represented by
$D_1$) intact, but changes signatures in a signature table. Hence, in the actual
implementation described in [7], the
$(\mathrm{Col} \circ (\mathrm{Ded} \cup \{\epsilon\}))^*$ strategy is applied
by: (i) deleting all signatures that will be changed, i.e., deleting all D-rules
which can be collapsed; (ii) computing new signatures using the original copy of
the signatures stored in the form of the dag $D_1$; and, finally, (iii)
inserting the newly computed signatures into the signature table and checking
for possible deduction steps. Our description achieves the same end result, but
with fewer inferences.

Secondly, in the Downey, Sethi and Tarjan procedure, for efficiency, an
equation $c \approx d$ is oriented to $c \to d$ if $c$ occurs fewer times than
$d$ in the signature table. This is done to minimize the number of collapse
steps. Additionally, indexing based on the use() tables is used for efficiently
implementing the specific strategy.
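The orientation heuristic and the use() index can be illustrated together. The following Python sketch makes two assumptions that are not taken from [7]: D-rules are stored as a dict from left-hand-side tuples to constants, and the number of signatures containing a constant is used as a stand-in for its occurrence count. The function names are hypothetical.

```python
from collections import defaultdict

def build_use_table(d_rules):
    """Map each constant to the left-hand sides (signatures) that contain it."""
    use = defaultdict(set)
    for lhs in d_rules:
        for arg in lhs[1:]:       # lhs[0] is the function symbol
            use[arg].add(lhs)
    return use

def orient(c, d, use):
    """Return (source, target) of the C-rule created for the equation c = d.

    The constant that appears in fewer signatures is rewritten, so that
    fewer D-rules have to be collapsed afterwards.
    """
    if len(use[c]) <= len(use[d]):
        return c, d
    return d, c


# Signature table of Example 3 after the collapse and deduction steps.
d_rules = {("a",): "c0", ("b",): "c1", ("f", "c1"): "c4", ("f", "c2"): "c3"}
use = build_use_table(d_rules)
source, target = orient("c2", "c4", use)
```

Here `c4` occurs in no signature while `c2` occurs in one, so the sketch picks the rule $c_4 \to c_2$ and no collapse step is triggered.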

Let $(K_1, \emptyset, D_1 \cup C_1) \vdash^* (K_n, E_n, D_n \cup C_n)$ be a
derivation using the DST strategy. Then, it is easily seen that the state
$(K_n, E_n, D_n \cup C_n)$ is a final state, and hence the set $D_n \cup C_n$ is
convergent, and also an abstract congruence closure. We remark here that $D_n$
holds the information that is contained in the signature table, and $C_n$ holds
the information in the union-find structure. The set $C_n$ is usually considered
the output of the Downey, Sethi and Tarjan procedure.

Example 3. We illustrate the Downey-Sethi-Tarjan algorithm by using the same
set of equations $E_0$ used in Example 1 of Section 2. The start state is
$(K_1, \emptyset, D_1 \cup C_1)$ where $K_1 = \{c_0, \ldots, c_4\}$,
$D_1 = \{a \to c_0, b \to c_1, fc_0 \to c_2, fc_2 \to c_3, fc_1 \to c_4\}$, and
$C_1 = \{c_0 \to c_1, c_3 \to c_4\}$.

i | Consts $K_i$ | Eqns $E_i$ | Rules $R_i$ | Transition
--+--------------+------------+-------------+-----------
1 | $K_1$ | $\emptyset$ | $D_1 \cup C_1$ |
2 | $K_1$ | $\emptyset$ | $\{a \to c_0, b \to c_1, fc_1 \to c_2, fc_2 \to c_3, fc_1 \to c_4\} \cup C_1$ | Col
3 | $K_1$ | $\{c_2 \approx c_4\}$ | $R_2$ | Ded
4 | $K_1$ | $\emptyset$ | $R_3 - \{fc_1 \to c_2\} \cup \{c_4 \to c_2\}$ | Ori

Note that $c_4 \approx c_2$ was oriented in a way that no further collapses were
needed thereafter.

The Nelson Oppen Procedure 

The Nelson-Oppen procedure is not exactly a completion procedure and it does 
not generate a congruence closure in our sense. The initial state of the Nelson- 

We could make a copy of the original $D_1$ rules and not change them, while
keeping a separate copy as the signatures.

Oppen procedure is given by the tuple $(K_1, E_1, D_1)$, where $D_1$ is the
input dag, and $E_1$ represents an equivalence on vertices of this dag. The sets
$K_1$ and $D_1$ remain unchanged in the Nelson-Oppen procedure. In particular,
the inference rule used for deduction is different from the conventional
deduction rule.



NODeduction:

$$\frac{(K, E, D \cup C)}{(K, E \cup \{c \approx d\}, D \cup C)}$$

if there exist two D-rules $f(c_1, \ldots, c_k) \to c$ and
$f(d_1, \ldots, d_k) \to d$ in the set $D$, and $c_i \leftrightarrow^*_C d_i$
for $i = 1, \ldots, k$.
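The side condition of NODeduction can be checked by comparing signatures modulo the C-rules. The Python sketch below is illustrative only; it assumes that each constant has at most one C-rule and that the C-rules terminate, so two constants are related by $\leftrightarrow^*_C$ exactly when they have the same normal form. The function names are hypothetical.

```python
def rep(c, c_rules):
    """Normal form of the constant c with respect to the C-rules."""
    while c in c_rules:
        c = c_rules[c]
    return c

def no_deductions(d_rules, c_rules):
    """Equations c = d licensed by NODeduction and not yet implied by C.

    The D-rules themselves are left unchanged, as in the Nelson-Oppen
    procedure; only new equations between constants are reported.
    """
    seen = {}     # signature modulo C -> right-hand side of the first rule
    found = []
    for (symbol, *args), rhs in d_rules.items():
        signature = (symbol, *[rep(x, c_rules) for x in args])
        if signature in seen:
            other = seen[signature]
            if rep(other, c_rules) != rep(rhs, c_rules):
                found.append((other, rhs))
        else:
            seen[signature] = rhs
    return found


# State 2 of Example 4: the input dag D_1 together with the C-rule c0 -> c1.
D1 = {("a",): "c0", ("b",): "c1", ("f", "c0"): "c2",
      ("f", "c2"): "c3", ("f", "c1"): "c4"}
new_equations = no_deductions(D1, {"c0": "c1"})
```

For this state the sketch reports the single equation $c_2 \approx c_4$, because $fc_0 \to c_2$ and $fc_1 \to c_4$ have left-hand sides that agree modulo $c_0 \to c_1$.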

The Nelson-Oppen procedure can now be (at a certain abstract level) represented
as:

$$\mathrm{NO} = (\mathrm{Sim}^* \circ (\mathrm{Ori} \cup \mathrm{Del}) \circ \mathrm{NODed}^*)^*$$

which is applied in the following sense: (i) select a C-equation $c \approx d$
from the E-component, (ii) simplify the terms $c$ and $d$ using simplification
steps until the terms can’t be simplified any more, (iii) either delete or
orient the simplified C-equation, (iv) apply the NODeduction rule until there
are no more non-redundant applications of this rule, (v) if the E-component is
empty, then we stop, otherwise continue with step (i).

Certain details, like the fact that newly added equations to the set $E$ are
chosen before the old ones in an application of orientation, and indexing based
on the use() table, are abstracted away in this description.

Using the Nelson-Oppen strategy, assume we get a derivation
$(K_1, E_1, D_1) \vdash^*_{\mathrm{NO}} (K_n, E_n, D_n \cup C_n)$. One
consequence of using a non-standard deduction rule, NODeduction, is that the
resulting set $D_n \cup C_n = D_1 \cup C_n$ need not necessarily be convergent,
although the rewrite relation $D_n/C_n$ [6] is convergent.

Example 4. Using the same set $E_0$ as equations, we illustrate the
Nelson-Oppen procedure. The initial state is given by $(K_1, E_1, D_1)$ where
$K_1 = \{c_0, c_1, c_2, c_3, c_4\}$;
$E_1 = \{c_0 \approx c_1, c_3 \approx c_4\}$; and
$D_1 = \{a \to c_0, b \to c_1, fc_0 \to c_2, fc_2 \to c_3, fc_1 \to c_4\}$.

i | Constants $K_i$ | Equations $E_i$ | Rules $R_i$ | Transition
--+-----------------+-----------------+-------------+-----------
1 | $K_1$ | $E_1$ | $D_1$ |
2 | $K_1$ | $\{c_3 \approx c_4\}$ | $D_1 \cup \{c_0 \to c_1\}$ | Ori
3 | $K_1$ | $\{c_2 \approx c_4, c_3 \approx c_4\}$ | $R_2$ | NODed
4 | $K_1$ | $\{c_3 \approx c_4\}$ | $R_2 \cup \{c_2 \to c_4\}$ | Ori
5 | $K_1$ | $\emptyset$ | $R_4 \cup \{c_3 \to c_4\}$ | Ori

Consider deciding the equality $fa \approx ffb$. Even though
$fa \leftrightarrow^*_{E_0} ffb$, the terms $fa$ and $ffb$ have distinct normal
forms with respect to $R_5$. But terms in the original term universe have
identical normal forms.

The NODeduction rule performs deduction modulo C-equations, i.e., we compute
critical pairs between D-rules modulo the congruence induced by C-equations.
Hence, the Nelson-Oppen procedure can be described as an extended completion [6]
(or completion modulo C-equations) method over an extended signature.

4 Experimental Results

We have implemented five congruence closure algorithms, including those pro- 
posed by Nelson and Oppen (NO) [9], Downey, Sethi and Tarjan (DST) [7], 
and Shostak [11], and two algorithms based on completion — one with an index- 
ing mechanism (IND) and the other without (COM). The implementations of 
the first three procedures are based on the representation of terms by directed 
acyclic graphs and the representation of equivalence classes by a union-find data 
structure. The completion procedure COM uses the following strategy: 

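$$((\mathrm{Sim}^* \circ \mathrm{Ext}^*)^* \circ (\mathrm{Del} \cup \mathrm{Ori}) \circ (\mathrm{Com} \circ \mathrm{Col})^* \circ \mathrm{Ded}^*)^*.$$

The indexed variant IND uses a slightly different strategy:

$$((\mathrm{Sim}^* \circ \mathrm{Ext}^*)^* \circ (\mathrm{Del} \cup \mathrm{Ori}) \circ (\mathrm{Col} \circ \mathrm{Com} \circ \mathrm{Ded})^*)^*.$$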

Indexing in the case of completion refers to the use of suitable data structures 
to efficiently identify which D-rules contain specified constants. 

In a first set of experiments, we assume that the input is a set of equations 
presented as pairs of trees (representing terms) . We added a preprocessing step 
to the NO and DST algorithms to convert the given input terms into a dag 
and initialize the other required data-structures. The other three algorithms 
interleave construction of a dag with deduction steps. The published
descriptions of DST and NO do not address the construction of a dag. Our
implementation
maintains the list of terms that have been represented in the dag in a hash table 
and creates a new node for each term not yet represented. We present below a 
sample of our results to illustrate some of the differences between the various 
algorithms. 

The input set of equations $E$ can be classified based on: (i) the size of the
input and the number of equations, (ii) the number of equivalence classes on
terms and subterms of $E$, and (iii) the size of the use lists. The first set of
examples is relatively simple and was developed by hand to highlight strengths
and weaknesses of the various algorithms. Example (a) contains five equations
that induce a single equivalence class. Example (b) is the same as (a), except
that it contains five copies of all the equations. Example (c) requires slightly
larger use lists. Finally, example (d) consists of equations that are oriented
in the “wrong” way.

In Table 1 we compare the different algorithms by their total running time, 
including the preprocessing time. The times shown are the averages of several 
runs on a Sun Ultra workstation under similar load conditions. The time was 
computed using the gettimeofday system call. 

® The equation set is {/^(a) « a, f^°{a) « /^®(6), 6 « /®(6), a « /®(a), /®(6) « 6}. 

The equation set is $\{g(a,a,b) \approx f(a,b), gabb \approx fba, gaab \approx
gbaa, gbab \approx gabb, gbba \approx gbab, gaaa \approx faa, a \approx c,
c \approx d, d \approx e, b \approx c1, c1 \approx d1, d1 \approx e1\}$.

The set is {<?(/*(a), h^'^(b)) « g(a, b),i = {1, ■ ■ ■ , 25}, h'^'^(b) « 6, 6 « h^^(b), h(b) « 
c0, c0 ≈ c1, c1 ≈ c2, c2 ≈ c3, c3 ≈ c4, c4 ≈ a, a ≈ f(a)}.

[Table 1 body not recoverable; surviving row fragment: 34, (illegible), 2,
10.556, 22.488, 7.275, 12.077, 4.416]

Table 1. Total running time (in milliseconds) for Examples (a)-(d). Eqns refers
to the number of equations; Vert to the number of vertices in the initial dag;
and Class to the number of equivalence classes induced on the dag.



Table 2 contains similar comparisons for considerably larger examples
consisting of randomly generated equations over a specified signature. Again we
show total running time, including preprocessing time.

[Table 2 body not recoverable; surviving row fragment: 290, 1.452, 3.670,
0.392, 0.374]

Table 2. Total running time (in seconds) for randomly generated equations. The
columns $\Sigma_i$ denote the number of function symbols of arity $i$ in the
signature and $d$ denotes the maximum term depth.



In Table 3 we show the time for computing a congruence closure assuming 
terms are already represented by a dag. In other words, we do not include the 
time it takes to create a dag. Note that we include no comparison with Shostak’s 
method, as the dynamic construction of a dag from given term equations is inher- 
ent in this procedure. However, a comparison with a suitable strategy (in which 
all extension steps are applied before any deduction steps) of IND is possible. 
We denote by IND* indexed completion based on a strategy that first constructs 
a dag. The examples are the same as in Table 2. 

Several observations can be drawn from these results. First, the Nelson- 
Oppen procedure NO is competitive only when few deduction steps are per- 
formed and thus the number of equivalence classes is large. This is because it 
uses a non-standard deduction rule, which forces the procedure to unnecessarily 
repeat the same deductions many times over in a single execution. Not sur- 
prisingly, straight-forward completion without indexing is also inefficient when 

Times for COM are not included, as indexing is indispensable for larger
examples.

Table 3. Running time (in seconds) when input is in a dag form.



many deduction steps are necessary. Indexing is of course a standard technique 
employed in all practical implementations of completion. 

The running time of the DST procedure critically depends on the size of the
hash table that contains the signatures of all vertices. If the hash table size
is large enough, potential deductions can be detected in (almost) constant time.
If the hash table size is reduced, to say 100, then the running time increases
by a factor of up to 50. A hash table with 1000 entries was sufficient for our
examples (which contained fewer than 10000 vertices). Larger tables did not
improve the running times.

Indexed Completion, DST and Shostak’s method are roughly comparable in
performance, though Shostak’s algorithm has some drawbacks. For instance,
equations are always oriented from left to right. In contrast, Indexed
Completion always orients equations in a way so as to minimize the number of
applications of the collapse rule, an idea that is implicit in Downey, Sethi and
Tarjan’s algorithm. Example (b) illustrates this fact. More crucially, the
manipulation of the use lists in Shostak’s method is done in a convoluted manner
due to which redundant inferences may be done when searching for the correct
non-redundant ones. As a consequence, Shostak’s algorithm performs poorly on
instances where use lists are large and deduction steps are many, such as in
Examples (c), 4 and 5.

Finally, we note that the indexing used in our implementation of completion 
is simple — with every constant c we associate a list of D-rules that contain c as a 
subterm. On the other hand DST maintains at least two different ways of indexing 
the signatures, which makes it more efficient when the examples are large and 
deduction steps are plenty. On small examples, the overhead to maintain the 
data structures dominates. This also suggests that the use of more sophisticated 
indexing schemes for indexed completion might improve its performance. 



5 Related Work and Conclusion 

Kapur [8] considered the problem of casting Shostak’s congruence closure [11] 
algorithm in the framework of ground completion on rewrite rules. Our work has 
been motivated by the goal of formalizing not just one, but several congruence 
closure algorithms, so as to be able to better compare and analyze them. 

The description in Section 3 accurately reflects the logical aspects of
Shostak’s algorithm, but does not provide details on data structures like the
use lists.

We suggest that, abstractly, congruence closure can be defined as a ground
convergent system; and that this definition does not restrict the applicability
of congruence closure. The rule-based abstract description of the logical
aspects of the various published congruence closure algorithms leads to a better
understanding of these methods. It explains the observed behaviour of
implementations and also allows one to identify weaknesses in specific
algorithms. Additionally, using the abstract rules, we can also get efficient
implementations of completion-based congruence closure procedures: one can
effectively utilize the theory of redundancy to figure out and eliminate
inferences which are not necessary, and moreover also use knowledge about
efficient indexing mechanisms.

The concept of an abstract congruence closure is also relevant for describing 
applications that use congruence closure algorithms. Some of these applications 
include efficient normalization by rewrite systems [4, 2], computing a complete 
set of rigid E-unifiers [13], and combination of decision procedures [11]. The
notion of an abstract congruence closure is naturally extended to handle presence 
of associative-commutative operators, and this application is described in [3]. 
We believe that theories other than associativity and commutativity can also be 
incorporated with the inference rules for abstract congruence closure. 

Congruence closure has also been used to construct a convergent set of ground 
rewrite rules in polynomial time by Snyder [12] and other works. Plaisted et
al. [10] gave a direct method, not based on using congruence closure, for com-
pleting a ground rewrite system in polynomial time. Hence our work completes 
the missing link, by showing that congruence closure is nothing but ground com- 
pletion. In fact, the process of transforming a set of rewrite rules over an extended 
signature (representing an abstract congruence closure) into a convergent set of 
rewrite rules over the original signature can be easily described by additional 
transition rules [3]. Our approach is different from that of Snyder, and can be 
used to obtain a more efficient implementation partly because Snyder’s algo- 
rithm needs two passes of the congruence closure algorithm, whereas we would 
need to compute the abstract congruence closure just once. 

The concept of an abstract congruence closure as detailed here and the rules 
for computation open up new frontiers too. For example, the transition rules 
presented in Section 2 can be naturally implemented in MAUDE [5] . Moreover, 
specific strategies, such as the ones presented in Section 3 can be encoded easily 
too. This might provide a basis for automatically verifying the correctness of 
congruence closure algorithms.

Abstract. We present a flexible framework for cooperating decision procedures.
We describe the properties needed to ensure correctness and
show how it can be applied to implement an efficient version of Nel- 
son and Oppen’s algorithm for combining decision procedures. We also 
show how a Shostak style decision procedure can be implemented in the 
framework in such a way that it can be integrated with the Nelson-Oppen 
method. 



1 Introduction 

Decision procedures for fragments of first-order or higher-order logic are po- 
tentially of great interest because of their versatility. Many practical problems 
can be reduced to problems in some decidable theory. The availability of robust 
decision procedures that can solve these problems within reasonable time and
memory could save a great deal of effort that would otherwise go into imple- 
menting special cases of these procedures. 

Indeed, there are several publicly distributed prototype implementations of
decision procedures, for theories such as Presburger arithmetic [15] and for
decidable combinations of quantifier-free first-order theories [2]. These and
similar procedures
have been used as components in applications, including interactive theorem 
provers [13,9], infinite-state model checkers [7,10,4], symbolic simulators [18], 
software specification checkers [14], and static program analyzers [8]. 

Nelson and Oppen [12] showed that satisfiability procedures for several the- 
ories that satisfy certain conditions can be combined into a single satisfiability 
procedure by propagating equalities. Many others have built upon this work, 
offering new proofs and applications [19,1]. 

Shostak [17,6,16] gave an alternative method for combining decision pro- 
cedures. His method is applicable to a more restricted set of theories, but is 
reported to be more efficient and is the basis for combination methods found 
in SVC [2], PVS [13], and STeP [9,3]. An understanding of his algorithm has 
proven to be elusive. 

Both STeP and PVS have at least some ability to combine the methods of 
Nelson and Oppen and Shostak [5,3], but not much detail has been given, and 
the methods used in PVS have never been published. As a result, there is still 
significant confusion about the relationship between these two methods and how 
to implement them efficiently and correctly.

Our experience with SVC, a decision procedure for quantifier-free first-order
logic based loosely on Shostak’s method for combining cooperating decision pro- 
cedures, has been both positive and negative. On the one hand, it has been 
implemented and is efficient and reliable enough to enable new capabilities in 
our research group and at a surprisingly large number of other sites. However, 
efforts to extend and modify SVC have revealed unnecessary constraints in the 
underlying theory, as well as gaps in our understanding of it. 

This paper is an outcome of ongoing attempts to re-architect SVC to resolve 
these difficulties. We present an architecture for cooperating decision procedures 
that is simple yet flexible and show how the soundness, completeness, and ter- 
mination of the combined decision procedure can be proved from a small list of 
clearly stated assumptions about the constituent theories. As an example of the 
application of this framework, we show how it can be used to implement and 
integrate the methods of Nelson and Oppen and Shostak. In so doing, we also 
describe an optimization applicable to the original Nelson and Oppen procedure 
and show how our framework simplifies the proof of correctness of Shostak’s 
method. Due to the scope of this paper and space restrictions, many of the 
proofs have been abbreviated or omitted. 

2 Definitions and Notation 

Expressions in the framework are represented using the logical symbols true, 
false, and ‘=’, an arbitrary number of variables, and non-logical symbols con- 
sisting of constants, and function and predicate symbols. We call true and false 
constant formulas. An atomic formula is either a constant formula, an equality 
between terms, or a predicate applied to terms. A literal is either an atomic for- 
mula or an equality between a non-constant atomic formula and false. Equality 
with false is used to represent negation. Formulas include atomic formulas, and 
are closed under the application of equality, conjunction and quantifiers. An ex- 
pression is either a term or a formula. An expression is a leaf if it is a variable or 
constant. Otherwise, it is a compound expression, containing an operator applied 
to one or more children. 

A theory is a set of first-order sentences. For the purposes of this paper,
we assume that all theories include the axioms of equality. The signature of a
theory is the set of function, predicate, and constant symbols appearing in
those sentences. The language of a signature $\Sigma$ is the set of all
expressions whose function, predicate, and constant symbols come from $\Sigma$.
Given a theory $T$ with signature $\Sigma$, if $\phi$ is a sentence in the
language of $\Sigma$, then we write $T \models \phi$ to mean that every model of
$T$ is also a model of $\phi$. For a given model, $M$, an interpretation is a
function which assigns an element of the domain of $M$ to each variable. If
$\Gamma$ is a set of formulas and $\phi$ is a formula, then we write
$\Gamma \models \phi$ to mean that for every model and interpretation satisfying
each formula in $\Gamma$, the same model and interpretation satisfy $\phi$.
Finally, if $\Phi$ is a set of formulas, then $\Gamma \models \Phi$ indicates
that $\Gamma \models \phi$ for each $\phi$ in $\Phi$.

Expressions are represented using a directed acyclic graph (DAG) data structure
such that any two expressions which are syntactically identical are uniquely
represented by a single DAG. The following operations on expressions are
supported.

Op(e)   the operator of e (just e itself if e is a leaf).

e[i]    the i-th child of e, where e[1] is the first child.

If $e_1$ and $e_2$ are expressions, then we write $e_1 \equiv e_2$ to indicate
that $e_1$ and $e_2$ are the same expression (syntactically identical). In
contrast, $e_1 = e_2$ is simply intended to represent the expression formed by
applying the equality operator to $e_1$ and $e_2$. Expressions can be annotated
with various attributes. If a is an attribute, e.a is the value of that
attribute for expression e. Initially, e.a = $\bot$ for each e and a, where
$\bot$ is a special undefined value.
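One common way to obtain a unique node per expression is hash-consing. The Python sketch below is an assumption-laden illustration rather than the data structure used in SVC: the names `Expr` and `mk` are hypothetical, the operator of a leaf is stored as its name, and an absent entry in `attrs` plays the role of the undefined value.

```python
class Expr:
    """An expression node; create nodes with mk() so equal ones are shared."""

    _table = {}   # (op, children) -> the unique node for that expression

    def __init__(self, op, children):
        self.op = op                 # operator, or the name of a leaf
        self.children = children     # tuple of Expr nodes (empty for a leaf)
        self.attrs = {}              # attribute name -> value

    def __getitem__(self, i):
        return self.children[i - 1]  # e[1] is the first child


def mk(op, *children):
    """Return the unique node for op applied to the given children."""
    key = (op, children)
    if key not in Expr._table:
        Expr._table[key] = Expr(op, children)
    return Expr._table[key]


x = mk("x")
e1 = mk("f", x, mk("y"))
e2 = mk("f", mk("x"), mk("y"))
```

Since children are themselves unique nodes, comparing them by identity inside the key is enough, and syntactic identity of expressions reduces to the test `e1 is e2`.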

The following simple operations make use of an expression attribute called 
find to maintain equivalence classes of expressions. We assume that these are 
the only functions that reference the attribute. Note that when presenting pseu- 
docode here and below, some required preconditions may be given next to the 
name and parameters of the function. 

HasFind(a)
  RETURN a.find ≠ ⊥;

SetFind(a) {a.find = ⊥}
  a.find := a;

Find(a) {HasFind(a)}
  IF (a.find = a) THEN RETURN a;
  ELSE RETURN Find(a.find);

Union(a,b) {a.find = a ∧ b.find = b}
  a.find := b.find;

In some similar algorithms, e.find is initially set to e, rather than $\bot$.
The reason we don’t do this is that it turns out to be convenient to use an
initialized find attribute as a marker that the expression has been seen before.
This not only simplifies the algorithm, but it also makes it easier to describe
certain invariants about expressions.

The find attribute induces a relation $\sim$ on expressions: $a \sim b$ if and
only if HasFind(a) $\wedge$ HasFind(b) $\wedge$ [Find(a) = Find(b)]. For the set
of all expressions whose find attributes have been set, this relation is an
equivalence relation. The find database, denoted by $\mathcal{F}$, is defined as
follows: $a = b \in \mathcal{F}$ iff $a \sim b$. The following facts will be
used below.

Find Database Monotonicity. If the preconditions for SetFind and Union are met,
then if $\mathcal{F}$ is the find database at some previous time and
$\mathcal{F}'$ is the find database now, then
$\mathcal{F} \subseteq \mathcal{F}'$.

Find Lemma. If the preconditions for Find, SetFind, and Union hold, then 
Find always terminates. 
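The four operations translate almost line by line into Python. In this sketch, `None` plays the role of the undefined value, the stated preconditions become assertions, and the class `Node` and the helper `related` (the induced relation on expressions) are names introduced here for illustration.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.find = None      # None stands for the undefined value


def has_find(a):
    return a.find is not None

def set_find(a):
    assert a.find is None                  # precondition of SetFind
    a.find = a

def find(a):
    assert has_find(a)                     # precondition of Find
    return a if a.find is a else find(a.find)

def union(a, b):
    assert a.find is a and b.find is b     # precondition of Union
    a.find = b.find

def related(a, b):
    """The relation induced by the find attribute."""
    return has_find(a) and has_find(b) and find(a) is find(b)


x, y, z = Node("x"), Node("y"), Node("z")
set_find(x)
set_find(y)
union(x, y)
```

After these calls `related(x, y)` holds, while `z`, whose find attribute was never set, is related to nothing. The preconditions of Union only ever redirect a representative to another representative, which is why Find terminates and why pairs, once related, stay related.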

3 The Basic Framework 

As mentioned above, the purpose of the framework presented in this paper is to
combine satisfiability procedures for several first-order theories into a
satisfiability procedure for their union. Suppose that $T_1, \ldots, T_n$ are $n$
first-order theories, with signatures $\Sigma_1, \ldots, \Sigma_n$. Let $T = \bigcup T_i$ and
$\Sigma = \bigcup \Sigma_i$. The goal is to provide a framework for a satisfiability
procedure which determines the satisfiability in $T$ of a set of formulas in the
language of $\Sigma$. Our approach follows that of Nelson and Oppen [12]. We assume
that the intersection of any two signatures is empty and that each theory is
stably-infinite. A theory $T$ with signature $\Sigma$ is called stably-infinite if any
quantifier-free formula in the language of $\Sigma$ is satisfiable in $T$ only if it is
satisfiable in an infinite model of $T$. We also assume that the theories are
convex. A theory is convex if there is no conjunction of literals in the
language of the theory which implies a disjunction of equalities without implying
one of the equalities itself.
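
To make the convexity assumption concrete, here is a standard illustration, not taken from this paper, of a theory that fails to be convex: linear arithmetic over the integers.

```latex
% Over the integers, a conjunction of literals can entail a disjunction of
% equalities without entailing either disjunct on its own.
\[
  1 \le x \,\wedge\, x \le 2 \;\models\; x = 1 \,\vee\, x = 2,
\]
\[
  \text{but}\quad
  1 \le x \wedge x \le 2 \not\models x = 1
  \quad\text{and}\quad
  1 \le x \wedge x \le 2 \not\models x = 2 .
\]
```

Linear arithmetic over the rationals or reals, by contrast, is convex, so it satisfies the assumption made here.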

The interface to the framework from a client program consists of three
methods: AddFormula, Satisfiable, and Simplify. Conceptually, AddFormula adds
its argument (which must be a literal) to a set $\mathcal{A}$, called the assumption history.
Simplify transforms an expression into a new expression which is equivalent
modulo $T \cup \mathcal{A}$, and Satisfiable returns false if and only if
$T \cup \mathcal{A} \models \mathit{false}$. Since any quantifier-free formula can be converted to
disjunctive normal form, after which each conjunction of literals can be checked
separately for satisfiability, the restriction that the arguments to AddFormula
be literals does not restrict the power of the framework.

The framework includes sets of functions which are parameterized by theory.
For example, if f is such a function, we denote by $f_i$ the instance of f associated
with theory $T_i$. If for some f and $T_i$, we do not explicitly define the instance $f_i$, it
is assumed that a call to $f_i$ does nothing. It is convenient to be able to call these
functions based on the theory associated with some expression e. Expressions
are associated with theories as follows. First, variables are partitioned among
the theories arbitrarily. In some cases, one choice may be better than another,
as discussed in Sec. 5.1 below. An expression in the language of $\Sigma$ is associated
with theory $T_i$ if and only if it is a variable associated with $T_i$, its operator is a
symbol in $\Sigma_i$, or it is an equality and its left side is associated with theory $T_i$. If
an expression is associated with theory $T_i$, we call it an i-expression. We denote
by $\mathcal{T}(e)$ the index $i$, where e is an i-expression.
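
The association of expressions with theories can be sketched as a small recursive function. This is a minimal illustration, not the authors' implementation: the `Expr` class, the dictionaries `SIGNATURE` and `VAR_THEORY`, and the function name `theory_of` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Expr:
    op: str                 # operator symbol, or the name of a variable
    children: tuple = ()

def theory_of(e, signature, var_theory):
    """Return the index i such that e is an i-expression.

    signature:  operator symbol -> theory index (signatures are disjoint)
    var_theory: variable name   -> theory index (an arbitrary partition)
    """
    if e.op == "=":
        # An equality belongs to the theory of its left side.
        return theory_of(e.children[0], signature, var_theory)
    if not e.children and e.op in var_theory:
        return var_theory[e.op]
    return signature[e.op]

SIGNATURE = {"+": 1, "f": 2}        # assumed example signatures
VAR_THEORY = {"x": 1, "y": 2}       # assumed partition of the variables
x, y = Expr("x"), Expr("y")
fx = Expr("f", (x,))
s = Expr("+", (x, y))
eq = Expr("=", (fx, s))             # f(x) = x + y
```

In this example `eq` is a 2-expression because its left side `f(x)` is, even though its right side is a 1-expression.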

Figure 1 shows pseudocode for the basic framework. An input formula is first
simplified because it might already be known or reduce to something easier to
handle. Simplification involves the recursive application of Find as well as certain
rewrite rules. Assert calls Merge, which merges two $\sim$-equivalence classes. Merge
first calls Setup, which ensures that the expressions are in an equivalence class.

There are four places in the framework in which theory-specific functionality
can be introduced. TheorySetup, TheoryRewrite and PropagateEqualities
are theory-parameterized functions. Also, each expression has a notify attribute
containing a set of pairs (f, d), where f is a function and d is some data.
Whenever Merge is called on an expression a = b, the find attribute of a changes to
b, and f(a = b, d) is called for each (f, d) $\in$ a.notify. Typically, TheorySetup
adds callback functions to the notify attribute of various expressions to
guarantee that the theory's satisfiability procedure will be notified if one of those
AddFormula(e) { e is a literal }
  Assert(e);
  REPEAT
    done := true;
    FOREACH theory T_i DO IF PropagateEqualities_i() THEN done := false;
  UNTIL done;

Assert(e) { e is a literal; T ∪ A ⊨ e }
  IF ¬Satisfiable() THEN RETURN;
  e' := Simplify(e);
  IF e' = true THEN RETURN;
  IF Op(e') ≠ '=' THEN e' := (e' = true);
  Merge(e');

Merge(e) { Op(e) = '='; T ∪ A ⊨ e; see text for others }
  Setup(e[1]); Setup(e[2]);
  IF e[1] and e[2] are terms THEN TheorySetup_{T(e)}(e);
  Union(e[1],e[2]);
  FOREACH (f,d) ∈ e[1].notify DO f(e,d);

Setup(e)
  IF HasFind(e) THEN RETURN;
  FOREACH child c of e DO Setup(c);
  TheorySetup_{T(e)}(e);
  SetFind(e);

Simplify(e)
  IF HasFind(e) THEN RETURN Find(e);
  Replace each child c of e with Simplify(c);
  RETURN Rewrite(e);

Rewrite(e)
  IF HasFind(e) THEN RETURN Find(e);
  IF Op(e) = '=' THEN e' := RewriteEquality(e);
  ELSE e' := TheoryRewrite_{T(e)}(e);
  IF e ≠ e' THEN e' := Rewrite(e');
  RETURN e';

RewriteEquality(e)
  IF e[1] = e[2] THEN RETURN true;
  IF one child of e is true THEN RETURN the other child;
  IF e[1] = false THEN RETURN (e[2] = e[1]);
  RETURN e;

Satisfiable()
  RETURN true ≁ false;

Fig. 1. Basic Framework
expressions is merged with another expression. Finally, before returning from
AddFormula, each theory may notify the framework of additional equalities it
has deduced until each theory reports that there are no more equalities to
propagate.

Theory-specific code is distinguished from the framework code shown in Fig. 
1 and from user code which is the rest of the program. It may call functions 
in the framework, provided any required preconditions are met. Examples of 
theory-specific code for both Nelson-Oppen and Shostak style theories are given 
below, following a discussion of the abstract requirements which must be fulfilled 
by theory-specific code to ensure correctness. 

4 Correctness of the Basic Framework 

In order to prove correctness, we give a specification in terms of preconditions and 
postconditions and show that the framework meets the specification. Sometimes 
it is necessary to talk about the state of the program. Each run of a program 
is considered to be a sequence of states, where a state includes a value for each 
variable in the program and a location in the code. 



4.1 Preconditions and Postconditions 

The preconditions for each function in the framework except for Merge are shown 
in the pseudocode. In order to give the precondition for Merge, a few definitions 
are required. 

A path from an expression e to a sub-expression s of e is a sequence of
expressions $e_0, e_1, \ldots, e_n$ such that $e_0 = e$, $e_{i+1}$ is a child of $e_i$, and s is a child
of $e_n$. A sub-expression s of an expression e is called a highest find-initialized
sub-expression of e if HasFind(s) and there is a path from e to s such that
for each expression e' on the path, $\neg$HasFind(e'). An expression e is called
find-reduced if Find(s) = s for each highest find-initialized sub-expression s of
e.

An expression e is called merge-acceptable if e is an equation and one of the
following holds: e is a literal; e[1] is false or an atomic predicate and e[2] = true;
or e[1] = true and e[2] = false.

Merge Precondition.

Whenever Merge(e) is called, the following must hold.

1. e is merge-acceptable,

2. e[1] and e[2] are find-reduced,

3. e[1] $\neq$ e[2], and

4. $T \cup \mathcal{A} \models e$.



In addition to the preconditions, the following postconditions must be satisfied 
by the parameterized functions. 
TheoryRewrite Postcondition.

After e' := TheoryRewrite(e) or e' := RewriteEquality(e) is executed,
the following must hold:

1. $\mathcal{F}$ is unchanged by the call,

2. if e is a literal, then e' is a literal,

3. if e is find-reduced, then HasFind(e') or e' is find-reduced, and

4. $T \cup \mathcal{F} \models e = e'$.

TheorySetup Postcondition. 

After TheorySetup is executed, the find database is unchanged. 

If all preconditions and postconditions hold for all functions called so far, we say
that the program is in an uncorrupted state. Also, if true $\not\sim$ false, we say the
program is in a consistent state. A few lemmas are required before proving that
the preconditions and postconditions hold for the framework code.

Lemma 1. If the program is in an uncorrupted state and Union(a,b) has been
called, then since that call there have been no calls to Union where either
argument was a.

Proof. Once Union(a,b) is called, a.find $\neq$ a and this remains true since a
can never again be an argument to SetFind or Union.

Lemma 2 (Equality Find Lemma). If e is the equality a = b, the program is in
an uncorrupted and consistent state whose location is not between the call to
SetFind(e) and the next call to Union, and HasFind(e), then a and b are terms
and Find(e) = false.

Proof. Suppose HasFind(e). Then Setup(e) was called. But by the definition
of merge-acceptable, this can only happen if e[1] and e[2] are terms and
Merge(e = false) was called, in which case Union(e, false) is called
immediately afterwards. It is clear from the definition of merge-acceptable that Union
is never called with first argument false unless the second argument is true.
Thus, if true $\not\sim$ false, it follows from Lemma 1 that Find(e) = false. □



Lemma 3 (Literal Find Lemma). If the program is in an uncorrupted state
and e is a literal, then Find(e) is either e, true, or false.

Proof. From the previous lemma, it follows that if e is an equality, then Find(e)
is either e, true, or false. A similar argument shows that the same is true for a
predicate. □



Lemma 4 (Simplify Lemma).

If the program is in an uncorrupted state after e' := Simplify(e) is executed,
then the following are true:

1. $\mathcal{F}$ is unchanged by the call,

2. if e is a literal then e' is a literal,

3. if e is a literal or term, then e' is find-reduced, and

4. $T \cup \mathcal{F} \models e = e'$.



We must prove the following theorem. A similar theorem is required every time 
we introduce theory-specific code. 

Theorem 1. If the program is in an uncorrupted state located in the framework 
code, then the next state is also uncorrupted. 

Proof. 

Find Precondition: Find is called in two places by the framework. In each
case, we check the precondition before calling it.

SetFind Precondition: SetFind(e) is only called from Setup(e), which
returns if HasFind(e). Otherwise, Setup performs a depth-first traversal of the
expression and calls SetFind. It follows from the TheorySetup Postcondition
and the fact that expressions are acyclic that the precondition is satisfied.

Union Precondition: Union(a,b) is only called if Merge(a = b) is called first.
By the Merge precondition, a and b are find-reduced. It is easy to see that after
Setup(a) and Setup(b) are called, Find(a) = a and Find(b) = b.

AddFormula Precondition: We assume that AddFormula is only called with
literals.

Assert Precondition: Assert(e) is only called from AddFormula. In this case,
$e \in \mathcal{A}$, so it follows that $T \cup \mathcal{A} \models e$.

Merge Precondition: Merge(e') is called from Assert(e). We know that e is
a literal, so by the Simplify Lemma, Simplify(e) is a literal and is find-reduced.
It follows that e' is merge-acceptable and e'[1] and e'[2] are find-reduced and
unequal. From the Simplify Lemma, we can conclude that $T \cup \mathcal{F} \models e = e'$. It
follows from the soundness property (described next) that $T \cup \mathcal{A} \models e = e'$. We
know that $T \cup \mathcal{A} \models e$, so it follows that $T \cup \mathcal{A} \models e'$.

TheoryRewrite Postcondition: It is straightforward to check that each of
the requirements holds for RewriteEquality.

□ 



4.2 Soundness 

The satisfiability procedure is sound if whenever the program state is
inconsistent, $T \cup \mathcal{A} \models \mathit{false}$. Soundness depends on the invariance of the
following property.

Soundness Property. $T \cup \mathcal{A} \models \mathcal{F}$.

Lemma 5. If the program is in an uncorrupted state, then the soundness prop- 
erty holds. 
Proof. Initially, the find database is empty. New formulas are added in two
places. The first is in Setup, when SetFind is called. This preserves the soundness
property since it only adds a reflexive formula to $\mathcal{F}$. The other is in Merge(e),
when Union(e[1],e[2]) is called. This adds the formula e to $\mathcal{F}$, but we know
that $T \cup \mathcal{A} \models e$ by the Merge Precondition. It also results in the addition of any
formulas which can be deduced using transitivity and symmetry, but these are
also entailed because $T$ includes equality. □



Theorem 2. If the program is in an uncorrupted state, then the satisfiability 
procedure is sound. 

Proof. Suppose Satisfiable returns false. This means that true $\sim$ false. It
follows from the previous lemma that $T \cup \mathcal{A} \models \mathit{true} = \mathit{false}$, so
$T \cup \mathcal{A} \models \mathit{false}$.

□ 



4.3 Completeness 

The satisfiability procedure is complete if $T \cup \mathcal{A}$ is satisfiable whenever the
program is in a consistent state in the user code.

We define the merge database, denoted $\mathcal{M}$, as the set of all expressions e such
that there has been a call to Merge(e). In order to describe the property which
must hold for completeness, we first introduce a few definitions, adapted from
[19].

Recall that an expression in the language of $\Sigma$ is an i-expression if it is a
variable associated with $T_i$, its operator is a symbol in $\Sigma_i$, or it is an equality
and its left side is an i-expression. A sub-expression of e is called an i-leaf if it
is a variable or a j-expression, with $j \neq i$, and every expression along some path
from e is an i-expression. An i-leaf is an i-alien if it is not an i-expression. An
i-expression in which every i-leaf is a variable is called pure (or i-pure).

With each term t which is not a variable, we associate a fresh variable $v(t)$.
We define $v(t)$ to be t when t is a variable. For some expression or set of
expressions S, we define $\gamma_i(S)$ by replacing all of the i-alien terms t in S by
$v(t)$ (†) so that every expression in $\gamma_i(S)$ is i-pure. We denote by $\gamma_0(S)$ the set
obtained from S by replacing all maximal terms (i.e. terms without any
super-terms) t by $v(t)$. Let $\Theta$ be the set of all equations $t = v(t)$, where t is a
sub-term of some formula in $\mathcal{M}$. It is easy to see that $T \cup \mathcal{M}$ is satisfiable iff
$T \cup \mathcal{M} \cup \Theta$ is satisfiable.

Let $\mathcal{M}_i = \{e \mid e \in \mathcal{M} \wedge e \text{ is an } i\text{-expression}\}$. Define $\Theta_i$ similarly. Notice that
$(\mathcal{M} \cup \Theta)$ is logically equivalent to $\bigcup_i \gamma_i(\mathcal{M}_i \cup \Theta_i)$, since each can be
transformed into the other by repeated substitutions.

(†) Since expressions are DAGs, we must be careful about what is meant by replacing a
sub-expression. The intended meaning here and throughout is that the expression is
considered as a tree, and only occurrences of the term which qualify for replacement
in the tree are replaced. This means that some occurrences may not be replaced at
all, and the resulting DAG may look significantly different as a result.
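
The purification map $\gamma_i$ can be illustrated with a short Python sketch. The encoding is an assumption made for this example only: variables are strings, compound expressions are tuples `(operator, child, ...)`, and the names `theory_of`, `purify` and the generated variable names `v0`, `v1`, ... are hypothetical. The sketch treats the expression as a tree, in line with the footnote above.

```python
def theory_of(e, signature, var_theory):
    """Index i such that e is an i-expression (variables are plain strings)."""
    if isinstance(e, str):
        return var_theory[e]
    if e[0] == "=":
        return theory_of(e[1], signature, var_theory)
    return signature[e[0]]

def purify(e, i, signature, var_theory, fresh):
    """Replace every i-alien term t in the i-expression e by its variable v(t).

    fresh maps each replaced term to its variable, so equal terms share one.
    """
    if theory_of(e, signature, var_theory) != i:
        if isinstance(e, str):
            return e                              # v(t) = t for a variable
        return fresh.setdefault(e, "v%d" % len(fresh))
    if isinstance(e, str):
        return e
    return (e[0],) + tuple(purify(c, i, signature, var_theory, fresh)
                           for c in e[1:])

SIGNATURE = {"+": 1, "f": 2}                      # assumed example signatures
VAR_THEORY = {"x": 1, "y": 1}
fresh = {}
e = ("=", ("+", "x", ("f", "y")), ("f", "y"))     # x + f(y) = f(y)
pure = purify(e, 1, SIGNATURE, VAR_THEORY, fresh)
```

Both occurrences of `f(y)` are replaced by the same variable, so the result is a pure expression of theory 1.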
We define $\mathcal{V}$, the set of shared terms, as the set of all terms t such that $v(t)$
appears in at least two distinct sets $\gamma_i(\mathcal{M}_i \cup \Theta_i)$, $1 \le i \le n$. Let
$E(\mathcal{V}) = \{a = b \mid a, b \in \mathcal{V} \wedge a \sim b\}$, and let
$D(\mathcal{V}) = \{a \neq b \mid a, b \in \mathcal{V} \wedge a \not\sim b\}$. For a set of
expressions S, an arrangement $\pi(S)$ is a set such that for every two expressions
a and b in S, exactly one of $a = b$ or $a \neq b$ is in $\pi(S)$. We denote by $\pi(\mathcal{V})$
the arrangement $E(\mathcal{V}) \cup D(\mathcal{V})$ of $\mathcal{V}$ determined by $\sim$. Now we can state the
property required for completeness.
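
As an illustration of the arrangement determined by the equivalence relation, the following Python sketch builds the sets of equalities and disequalities for a small set of shared terms. The function name `arrangement` and the dictionary `rep`, which maps each term directly to its representative, are assumptions made for this example.

```python
from itertools import combinations

def arrangement(shared_terms, find):
    """Split all unordered pairs of shared terms into equalities and disequalities."""
    eqs, diseqs = set(), set()
    for a, b in combinations(sorted(shared_terms), 2):
        if find(a) == find(b):
            eqs.add((a, b))          # a = b belongs to the arrangement
        else:
            diseqs.add((a, b))       # a != b belongs to the arrangement
    return eqs, diseqs

# Assumed example: x and y share the representative z, w is alone.
rep = {"x": "z", "y": "z", "z": "z", "w": "w"}
eqs, diseqs = arrangement(["x", "y", "w"], rep.get)
```

Every pair lands in exactly one of the two sets, which is what makes the result an arrangement.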

Completeness Property. If the program is in a consistent state in the user
code, then $T_i \cup \gamma_i(\mathcal{M}_i \cup \pi(\mathcal{V}))$ is satisfiable.

The following lemmas are needed before proving completeness. 

Lemma 6. If the program is in an uncorrupted state, then $T \cup \mathcal{M} \models \mathcal{F}$.

Proof. Every formula in $\mathcal{F}$ is either in $\mathcal{M}$ or can be derived from formulas in $\mathcal{M}$
using reflexivity, symmetry, and transitivity of equality. □



Lemma 7. If the program is in an uncorrupted and consistent state in the user
code, then $T \cup \mathcal{M} \models \mathcal{A}$.

Proof. Suppose $e \in \mathcal{A}$. Then we know that Assert(e) was called at some time
previously. We can conclude by monotonicity of the find database that
true $\not\sim$ false at the time of that call. Thus, e' := Simplify(e) was executed. By the
Simplify Lemma, if $\mathcal{F}_1$ was the find database at the time of the call,
$T \cup \mathcal{F}_1 \models e = e'$. Now, if e' = true, then $T \cup \mathcal{F}_1 \models e$ and so by monotonicity
and Lemma 6, $T \cup \mathcal{M} \models e$. Otherwise, Merge is called. Let x be the argument to
Merge. It is easy to see that $T \cup \mathcal{F}_1 \models e = x$. But $x \in \mathcal{M}$, so $T \cup \mathcal{M} \models x$. It then
follows easily by monotonicity and Lemma 6 that $T \cup \mathcal{M} \models e$. □

The following theorem is from [19]. 

Theorem 3. Let $T_1$ and $T_2$ be two stably-infinite, signature-disjoint theories
and let $\phi_1$ be a set of formulas in the language of $T_1$ and $\phi_2$ a set of formulas
in the language of $T_2$. Let $v$ be the set of their shared variables and let $\pi(v)$ be
an arrangement of $v$. If $\phi_i \wedge \pi(v)$ is satisfiable in $T_i$ for $i = 1, 2$, then
$\phi_1 \wedge \phi_2$ is satisfiable in $T_1 \cup T_2$.

Theorem 4. If the procedure always maintains an uncorrupted state and the 
completeness property holds for each theory, then the procedure is complete. 

Proof. Suppose that for a consistent state in the user code,
$T_i \cup \gamma_i(\mathcal{M}_i \cup \pi(\mathcal{V}))$ is satisfiable for each i. This implies that
$T_i \cup \gamma_i(\mathcal{M}_i \cup \Theta_i \cup \pi(\mathcal{V}))$ is satisfiable
(since each equation in $\Theta_i$ simply defines a new variable), which is logically
equivalent (by applying substitutions from $\Theta_i$) to
$T_i \cup \gamma_i(\mathcal{M}_i \cup \Theta_i) \cup \gamma_0(\pi(\mathcal{V}))$.
Now, each set $\gamma_i(\mathcal{M}_i \cup \Theta_i)$ is a set of formulas in the language of $T_i$, and
$\gamma_0(\pi(\mathcal{V}))$ is an arrangement of the variables shared among these sets, so we
can conclude by repeated application of Theorem 3 that
$\bigcup_i \gamma_i(\mathcal{M}_i \cup \Theta_i)$ is satisfiable in $T$.
But $\bigcup_i \gamma_i(\mathcal{M}_i \cup \Theta_i)$ is equivalent to $\mathcal{M} \cup \Theta$, which is satisfiable in $T$
iff $T \cup \mathcal{M}$ is satisfiable. Finally, by Lemma 7, $T \cup \mathcal{M} \models \mathcal{A}$. Thus we can
conclude that $T \cup \mathcal{A}$ is satisfiable. □



4.4 Termination 

We must show that each function in the framework terminates. The following 
requirements guarantee this. 

Termination Requirements.

1. The preconditions for Find, SetFind, and Union always hold.

2. For each i-expression e, TheoryRewrite_i(e) terminates.

3. If s is a sequence of expressions in which the next member of the sequence
e' is formed from the previous member e by calling TheoryRewrite_i(e),
then beyond some element of the sequence, all the expressions are identical.

4. For each i-expression e, TheorySetup_i(e) terminates.

5. After Union(a,b) is called,

(a) No new entries are added to a.notify.

(b) Each call to each function in a.notify terminates.

6. For each theory $T_i$, PropagateEqualities_i terminates and after calling
PropagateEqualities_i some finite number of times, it will always return
false.

Theorem 5. If the termination requirements hold, then each function in the 
framework terminates. 

Proof. The first condition guarantees that Find terminates, from which it follows 
that Satisfiable terminates. The next two ensure that Rewrite terminates. It 
then follows easily that Simplify must terminate. The next few conditions are 
sufficient to ensure that Setup and Merge terminate, from which it follows that 
Assert terminates. This, together with the last condition allows us to conclude 
that AddFormula terminates. □ 

It is not hard to see that without any theory-specific code, these requirements 
hold. 



5 Examples Using the Framework 

In this section we will give two examples to show how the framework can ac- 
commodate different kinds of theory-specific code. 
5.1 Nelson-Oppen Theories

A Nelson-Oppen style satisfiability procedure for a theory $T_i$ must be able to
determine the satisfiability of a set of formulas in the language of $\Sigma_i$ as well
as which equalities between variables are entailed by that set of formulas [12].
We present a method for integrating such theories which is flexible and efficient.
Suppose we have a Nelson-Oppen style satisfiability procedure which treats alien
terms as variables with the following methods:

AddFormula_i          Adds a new formula to the set $\mathcal{A}_i$.

Satisfiable_i         True iff $T_i \cup \gamma_i(\mathcal{A}_i)$ is satisfiable.

AddTermToPropagate_i  Adds a term to the set $\Delta_i$.

GetEqualities_i       Returns the largest set of equalities $\mathcal{E}_i$ between terms
                      in $\Delta_i$ such that $T_i \cup \gamma_i(\mathcal{A}_i) \models \gamma_i(\mathcal{E}_i)$.

A new expression attribute, shared, is used to keep track of which terms
are relevant to more than one theory. Each theory is given an index, i, and the
shared attribute is set to i if the term is used by theory i. If more than one
theory uses the term, the shared attribute is set to 0. This is encapsulated in
the SetShared and IsShared methods shown below.

SetShared(e,i)
  IF e.shared = ⊥ THEN e.shared := i;
  ELSE IF e.shared ≠ i THEN e.shared := 0;
  AddTermToPropagate_i(e);

IsShared(e)
  RETURN e.shared = 0;
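
The shared attribute can be sketched in Python as follows. This is an illustration only: the `Term` class, the dictionary `propagate_sets` standing in for the sets of terms kept by each theory, and the assumption that theory indices start at 1 (so that 0 is free to mean "shared") are all choices made for the example.

```python
class Term:
    """Hypothetical term node; shared is None until some theory uses the term."""
    def __init__(self, name):
        self.name = name
        self.shared = None

propagate_sets = {}                  # theory index -> set of terms to propagate

def add_term_to_propagate(i, e):
    propagate_sets.setdefault(i, set()).add(e)

def set_shared(e, i):
    if e.shared is None:
        e.shared = i                 # first theory to use the term
    elif e.shared != i:
        e.shared = 0                 # a second, different theory uses it
    add_term_to_propagate(i, e)

def is_shared(e):
    return e.shared == 0

t = Term("t")
set_shared(t, 1)
used_by_one = is_shared(t)           # False: only theory 1 has seen t
set_shared(t, 2)
used_by_two = is_shared(t)           # True: theories 1 and 2 have seen t
```

A single small integer per term is enough because the framework only needs to distinguish "unused", "used by exactly one theory" and "used by more than one".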

Figure 2 shows the theory-specific code needed to add a theory $T_i$ with a
satisfiability procedure as described above. We will refer to a theory implemented in
this way as a Nelson-Oppen theory. Each i-expression is passed to TheorySetup_i.
TheorySetup_i marks these terms and their alien children as used by $T_i$. It also
ensures that Notify_i will be called if any of these expressions are merged with
something else. When Notify_i is called, the formula is passed along to the
satisfiability procedure for $T_i$. These steps correspond to the decomposition into pure
formulas in other implementations (but without the introduction of additional
variables). PropagateEqualities_i asserts any equations between shared terms
that have been deduced by the satisfiability procedure for $T_i$. This corresponds
to the equality propagation step in other methods. It is sufficient to propagate
equalities between shared variables, a fact also noted in [19].

We also introduce a new optimization. Not all theories need to know about
all equalities between shared terms. A theory is only notified of an equality if the
left side of that equality is a term that it has seen before. In order to guarantee
that this results in fewer propagations, we have to ensure that whenever an
equality between two terms is in $\mathcal{M}$, if one of the terms is not shared, then the
left term is not shared. We can easily do this by modifying RewriteEquality to
put non-shared terms on the left. However, this is not necessary for correctness, a
fact which allows the integration of Shostak-style satisfiability procedures which
require a different implementation of RewriteEquality as described in Sec. 5.2
below.
TheorySetup_i(e)
  FOREACH i-alien child a of e DO BEGIN
    a.notify := a.notify ∪ { (Notify_i, ∅) };
    SetShared(a,i);
  END
  e.notify := e.notify ∪ { (Notify_i, ∅) };
  IF e is a term THEN SetShared(e,i);

TheoryRewrite_i(e)
  RETURN e;

PropagateEqualities_i()
  propagate := false;
  IF Satisfiable() BEGIN
    IF ¬Satisfiable_i() THEN Merge(true = false);
    ELSE FOREACH x = y ∈ GetEqualities_i DO
      IF IsShared(x) AND IsShared(y) AND x ≁ y THEN BEGIN
        propagate := true;
        Assert(x = y);
      END
  END
  RETURN propagate;

Notify_i(e)
  IF e[1] is an i-alien term THEN BEGIN
    x := Find(e[2]);
    x.notify := x.notify ∪ { (Notify_i, ∅) };
    e := (e[1] = x);
  END
  AddFormula_i(e);

Fig. 2. Code for implementing a Nelson-Oppen theory T_i.



A final optimization is to associate variables with theories in such a way as
to avoid causing terms to be shared unnecessarily. For example, if x = t is
a formula in $\mathcal{M}$ and x is a variable and t is an i-term, it is desirable for x to
be an i-term as well (otherwise, t immediately becomes a shared term). In our
implementation, expressions are type-checked and each type is associated with
a theory. Thus, we can easily guarantee this by associating x with the theory
associated with its type.



Correctness. The proof of the following theorem is similar to that given for 
the framework code and is omitted. 



Theorem 6. If the program is in an uncorrupted state located in the theory- 
specific code for a Nelson-Oppen theory, then the next state is also uncorrupted. 
To show that the completeness property holds, we must show that if the
program is in a consistent state in the user code, then
$T_i \cup \gamma_i(\mathcal{M}_i \cup \pi(\mathcal{V}))$ is
satisfiable. This requires the following invariant to hold for each theory $T_i$.

Shared Term Requirement. There has been a call to SetShared(e,i) if $v(e)$
appears in $\gamma_i(\mathcal{M}_i \cup \Theta_i)$.

Lemma 8. If $T_i$ is a Nelson-Oppen theory, then the shared term requirement
holds for $T_i$.

Corollary 1. If $T_i$ is a Nelson-Oppen theory, and $v(t)$ appears in
$\gamma_i(\mathcal{M}_i \cup \Theta_i)$, then $t \in \Delta_i$.

Let $\Delta'_i = \Delta_i \cup \{x \mid x \text{ is a term and } t = x \in \mathcal{A}_i \text{ for some term } t\}$.

Lemma 9. If $T_i$ is a Nelson-Oppen theory and the program is in an uncorrupted
state in the user code and $x = y \in \mathcal{M}$, where $x \in \Delta'_i$, then $x = z \in \mathcal{A}_i$, where
z = Find(y) at some previous time.

Proof. Suppose $x \in \Delta_i$. Then SetShared was called. It is easy to see from the
code that at the time it was called, Notify_i was added to x.notify. If on the
other hand, $x \notin \Delta_i$, then $t = x \in \mathcal{A}_i$ for some t which is not an i-term. But
then, when t = x was added to $\mathcal{A}_i$, Notify_i was added to x.notify. In each
case, Notify_i(x = y) will be called after Merge(x = y) is called, so that
x = Find(y) is added to $\mathcal{A}_i$. □

Lemma 10. If $T_i$ is a Nelson-Oppen theory and the program is in an
uncorrupted state in the user code and $x \sim y$, where $x, y \in \Delta'_i$, then
$T_i \cup \gamma_i(\mathcal{A}_i) \models \gamma_i(x = y)$.

Proof. We can show by the previous lemma that since Find(x) = Find(y), there
is a chain of equalities in $\mathcal{A}_i$ linking x to y. □

Let $\mathcal{D}_i = \{a \neq b \mid a, b \in (\Delta_i \cap \mathcal{V})\}$, and let
$\mathcal{D}'_i = \{a \neq b \mid a, b \in (\Delta'_i \cap \mathcal{V})\}$.

Lemma 11. If $T_i$ is a Nelson-Oppen theory and the program is in an
uncorrupted and consistent state in the user code, then
$T_i \cup \gamma_i(\mathcal{A}_i \cup \mathcal{D}_i)$ is satisfiable.

Proof. No single disequality $x \neq y \in \mathcal{D}_i$ can be inconsistent because if it
were, that would mean $T_i \cup \gamma_i(\mathcal{A}_i) \models \gamma_i(x = y)$. But if this is the case, since
PropagateEqualities_i terminated, it must be the case that $x \sim y$. Since no
single equality x = y is entailed, it follows from the convexity of $T_i$ that no
disjunction of equalities can be entailed. □



Lemma 12. If $T_i$ is a Nelson-Oppen theory and the program is in an
uncorrupted and consistent state in the user code, then
$T_i \cup \gamma_i(\mathcal{A}_i \cup \mathcal{D}'_i)$ is satisfiable.

Proof. If $t'_1 \neq t'_2 \in \mathcal{D}'_i$, we can find (by the definition of $\Delta'_i$) some $t_1$ and $t_2$
such that $t_1 \neq t_2 \in \mathcal{D}_i$ and $\mathcal{A}_i \models (t_1 = t'_1 \wedge t_2 = t'_2)$. The result follows by the
previous lemma. □

Theorem 7. If each theory satisfies the shared term requirement and the
program is in an uncorrupted and consistent state in the user code, then if $T_i$ is a
Nelson-Oppen theory, the completeness property holds for $T_i$.

Proof. It is not hard to show that if $v(x) \in \gamma_i(\mathcal{M}_i \cup \Theta_i)$, then $x \in \Delta'_i$. It then
follows that an interpretation satisfying $T_i \cup \gamma_i(\mathcal{A}_i \cup \mathcal{D}'_i)$ can be modified to also
satisfy $\gamma_i(\pi(\mathcal{V}))$. □

Termination. The only termination condition that is non-trivial is the last one.
The following requirement is sufficient to fulfill this condition.

Nelson-Oppen Termination Requirement

Suppose that before a call to Assert from PropagateEqualities_i, n is the
number of equivalence classes in $\sim$ containing at least one term $t \in \mathcal{V}$. Then,
either the state following the call to Assert is inconsistent or, if m is the number
of equivalence classes in $\sim$ containing at least one term $t \in \mathcal{V}$ after returning
from Assert, m < n.

If every theory is a Nelson-Oppen theory, it is not hard to see that this
requirement holds. This is because each call to Assert merges the equivalence classes
of two shared variables without creating any new ones.

5.2 Adding Shostak Theories 

Suppose we have a theory $T_i$ with no predicate symbols which provides two
functions, $\sigma$ and $\omega$, which we refer to as the canonizer and solver respectively.
Note that if we have more than one such theory, we can often combine the
canonizers and solvers to form a canonizer and solver for the combined theory,
as described in [17] (‡). The functions $\sigma$ and $\omega$ have the following properties.

$\sigma$ is a canonizer for $T_i$ if

1. $T_i \models \gamma_i(a = b)$ iff $\sigma(a) = \sigma(b)$.

2. $\sigma(\sigma(t)) = \sigma(t)$ for all terms t.

3. $\gamma_i(\sigma(t))$ contains only variables occurring in $\gamma_i(t)$.

4. $\sigma(t) = t$ if t is a variable or not an i-term.

5. If $\sigma(t)$ is a compound i-term, then $\sigma(x) = x$ for each child x of $\sigma(t)$.

(‡) Although it has been claimed that solvers can always be combined to form a solver
for the combined theory [6,17], this is not always possible, as pointed out in [11].

(§) Shostak allows the solved form to be more general. To simplify the presentation, we
assume the solver returns a single, logically equivalent, equation.

$\omega$ is a solver (§) for $T_i$ if
1. If $T_i \models \gamma_i(x \neq y)$ then $\omega(x = y) = \mathit{false}$.

2. Otherwise, $\omega(x = y)$ is an equation $a = b$ where a and b are terms,

3. $T_i \models (x = y) \leftrightarrow (a = b)$,

4. $\gamma_i(a)$ is a variable and does not appear in $\gamma_i(b)$,

5. neither $\gamma_i(a)$ nor $\gamma_i(b)$ contain variables not occurring in $\gamma_i(x = y)$,

6. $\omega(a = b)$ is $a = b$ and $\sigma(b) = b$.

We call such a theory a Shostak theory. The code in Fig. 3 shows the additional
code needed to integrate a Shostak theory.
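
To give a feel for what a canonizer and a solver look like, here is a minimal Python sketch for linear arithmetic over the rationals. It is an illustration under stated assumptions, not the procedure used in this paper: terms are encoded as lists of `(coefficient, variable)` pairs with `None` marking the constant, the names `canon` and `solve` are hypothetical, and a valid equation is reported as `True` rather than being left to RewriteEquality.

```python
from fractions import Fraction

def canon(term):
    """Canonizer sketch: sum like monomials, drop zeros, sort by variable name.

    term is a list of (coefficient, variable) pairs; variable None is the constant.
    Returns (constant, ((variable, coefficient), ...)).
    """
    acc = {}
    for coeff, var in term:
        acc[var] = acc.get(var, Fraction(0)) + Fraction(coeff)
    const = acc.pop(None, Fraction(0))
    monomials = tuple(sorted((v, c) for v, c in acc.items() if c != 0))
    return const, monomials

def solve(lhs, rhs):
    """Solver sketch for lhs = rhs.

    Returns False if the equation is unsatisfiable, True if it is valid, and
    otherwise (x, t), meaning x = t, where x does not occur in the canonical t.
    """
    const, monomials = canon(list(lhs) + [(-Fraction(c), v) for c, v in rhs])
    if not monomials:
        return const == 0
    (x, cx), rest = monomials[0], monomials[1:]
    # cx*x + rest + const = 0, hence x = (-rest - const) / cx
    solved = canon([(-c / cx, v) for v, c in rest] + [(-const / cx, None)])
    return x, solved

# 2x + y = 4 is solved for x, giving x = 2 - y/2.
result = solve([(2, "x"), (1, "y")], [(4, None)])
```

Two terms that are equal in the theory, such as x + y and y + x, receive the same canonical form, and the solved variable never appears on the right-hand side, mirroring properties 1 and 4 above.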



RewriteEquality(e)
  IF e[1] = e[2] THEN RETURN true;
  IF one child of e is true THEN RETURN the other child;
  IF e[1] = false THEN RETURN (e[2] = e[1]);
  IF e[1] is a term THEN RETURN ω(e);
  RETURN e;

TheorySetup_i(e)
  FOREACH a which is an i-leaf in e DO BEGIN
    IF Op(e) = '=' THEN a.notify := a.notify ∪ { (UpdateDisequality, e) };
    ELSE a.notify := a.notify ∪ { (UpdateShostak, e) };
    SetShared(a,i);
  END
  IF e is a term THEN SetShared(e,i);

TheoryRewrite_i(e)
  RETURN σ(e);

PropagateEqualities_i()
  RETURN false;

UpdateDisequality(x,y)
  IF ¬Satisfiable() ∨ ¬HasFind(y) THEN RETURN;
  Replace each i-leaf c in y with Find(c);
  y' := Rewrite(y);
  IF y' ≠ false THEN Merge(y' = false);

UpdateShostak(x,y)
  IF Find(y) = y THEN BEGIN
    Replace each i-leaf c in y with Find(c) to get y';
    Merge(y = σ(y'));
  END

Fig. 3. Code for implementing a Shostak theory T_i.
Correctness. It is not hard to show that this code satisfies the preconditions
and requirements of the framework.

Theorem 8. If the program is in an uncorrupted state located in the theory- 
specific code for a Shostak theory, then the next state is also uncorrupted. 

Included in the Shostak code are the calls to SetShared necessary to allow this 
theory to be integrated with Nelson-Oppen theories. We have not included the 
code typically included for handling uninterpreted functions. This is because our 
approach allows us to consider uninterpreted functions as belonging to a separate 
Nelson-Oppen theory. Though we do not show how in this paper, any simple con- 
gruence closure algorithm can be integrated as a Nelson-Oppen theory. Omitting 
details related to uninterpreted functions simplifies the presentation and proof. 
We have also included code for handling disequalities, which Shostak’s original 
procedure does not handle directly. We will give some intuition for how this 
works after making a few definitions. 

Let $\Delta_i = \{t \mid t \text{ is an } i\text{-leaf in some expression } e \in \mathcal{M}\}$. Let
$\mathcal{E} = \{a = b \mid a \in \Delta_i \wedge b = \mathrm{Find}(a)\}$. For an expression e, define $\tau(e)$ to be the
expression obtained from e by replacing each i-leaf x in e by Find(x). Shostak's method
works by ensuring that Find(t) = $\sigma(\tau(t))$. This together with the properties
of the solver ensures that the set $\mathcal{E}$ is equivalent to a substitution, meaning it is
easily satisfiable. These are the key ideas of the completeness argument.

Lemma 13. If the program is in an uncorrupted and consistent state which
is not inside of a call to Merge, then for each term t such that HasFind(t),
Find(t) = $\sigma(\tau(t))$. Also, if Find(t) = t, then $\tau(t) = t$.

Proof. When SetFind is first called on an expression e, the Merge preconditions
together with the solver and canonizer guarantee that $e = \sigma(\tau(e))$. Then,
whenever an i-leaf is merged, UpdateShostak is called to preserve the invariant. □



Lemma 14. If the program is in an uncorrupted and consistent state in the user
code, and $T_i$ is a Shostak theory, then $T_i \cup \gamma_i(\mathcal{E})$ is satisfiable.

Proof. Let M be a model of $T_i$, and let $x \in \Delta_i$. If Find(x) = x, then assign $v(x)$
an arbitrary value. Otherwise, assign $v(x)$ the same value as $\gamma_i(\mathrm{Find}(x))$. By the
above lemma, this assignment satisfies $\gamma_i(\mathcal{E})$. □



Lemma 15. If the program is in an uncorrupted and consistent state in the user
code and $T_i$ is a Shostak theory, then $T_i \cup \gamma_i(\mathcal{M}_i)$ is satisfiable.

Proof. Suppose $e \in \mathcal{M}_i$. Clearly $e[1] \sim e[2]$. If $e$ is an equality between
terms, it follows from Lemma 13 that $\sigma(\tau(e[1])) = \sigma(\tau(e[2]))$. By properties
of $\sigma$, it follows that $T_i \models \tau(e[1]) = \tau(e[2])$. Then, by the definition of
$\mathcal{E}$, it follows that $T_i \cup \mathcal{E} \models e[1] = e[2]$ and hence
$T_i \cup \gamma_i(\mathcal{E}) \models \gamma_i(e[1] = e[2])$. Suppose on the other hand that $e$
is the literal (x = y) = false, and suppose that
$T_i \cup \gamma_i(\mathcal{E}) \models \gamma_i(x = y)$. The same argument as above in reverse
shows that Find(x) = Find(y). The UpdateDisequality code ensures that in this case
true will get merged with false, contradicting the assumption that the state
is consistent. Thus, $T_i \cup \gamma_i(\mathcal{E}) \not\models \gamma_i(x = y)$. Since $T_i$ is
convex, it follows that $T_i \cup \gamma_i(\mathcal{E} \cup \mathcal{M}_i)$ is satisfiable. □

Theorem 9. If the program is in an uncorrupted and consistent state in the
user code and $T_i$ is a Shostak theory, then the completeness property holds for
$T_i$.

Proof. The above lemma shows that $T_i \cup \gamma_i(\mathcal{E} \cup \mathcal{M}_i)$ is
satisfiable. Suppose $a$ and $b$ are shared terms. If $a \sim b$, a similar argument to
that given above shows that $T_i \cup \gamma_i(\mathcal{E}) \models \gamma_i(a = b)$. If, on the
other hand, $a \not\sim b$, it follows easily that
$T_i \cup \gamma_i(\mathcal{E}) \not\models \gamma_i(a = b)$. Since each equality in
$\gamma_i(\mathcal{M}_i \cup \pi(V))$ is entailed by $T_i \cup \gamma_i(\mathcal{E})$ and none of the
disequalities are, it follows by convexity that
$T_i \cup \gamma_i(\mathcal{M}_i \cup \pi(V))$ is satisfiable. □

Termination. The idempotency of the solver and canonizer is sufficient to
guarantee termination of rewrites. For each expression $e$, it is not hard to show
that something is added to e.notify only if Find(e) = e. Consider the functions
called by Merge, which are UpdateDisequality and UpdateShostak. Both of
them call Merge recursively. Each of them reduces the value of some measure of the
program state. For UpdateDisequality, the measure is the number of equality
expressions $e$ such that HasFind(e) and $\omega(\tau(e)) \neq$ false. For UpdateShostak,
the measure is the number of expressions $e$ such that Find(e) = e and
Find(c) $\neq$ c for some $i$-leaf $c$ of $e$. With some effort, it can be verified that
none of the functions in the theory-specific code presented thus far which can be
called after Union increases either of these measures. The other termination
conditions are trivial.

Finally, in order to combine Shostak and Nelson-Oppen, the Shostak code 
must not break the Nelson-Oppen Termination Requirement. Any new call to 
Merge has the potential to “create” new shared terms by causing a new term to 
show up in $\mathcal{M}_i$ for some $i$. A careful analysis shows that if Assert(x = y) is
called from the Nelson-Oppen code, any resulting call to Merge does not increase 
the number of equivalence classes containing shared terms. Lemma 13 ensures 
that by the time Assert has returned, x ~ y, so the number of equivalence 
classes containing shared terms decreases as required. 

6 Conclusion 

We have presented a framework for combining decision procedures for disjoint 
first-order theories, and shown how it can be used to implement and integrate 
Nelson-Oppen and Shostak style decision procedures. 

This work has shed considerable light on the individual methods as well as
on what is required to combine them. We discovered that a more restricted set
of equalities can be propagated in the Nelson-Oppen framework without losing
completeness. Also, by separating the uninterpreted functions from the Shostak
method, the code is simpler and easier to verify.

We are working on an extension of the framework which would handle non- 
convex theories and more general Shostak solvers. In future work, we hope also 
to be able to relax the requirements that the theories be disjoint and stably- 
infinite. We also plan to complete and distribute a new version of SVC based on 
these results.

Abstract. The concept of locales for Isabelle enables local definition
and assumption for interactive mechanical proofs. Furthermore, depen- 
dent types are constructed in Isabelle/HOL for first class representation 
of structure. These two concepts are introduced briefly. Although each 
of them has proved useful in itself, their real power lies in combination. 
This paper illustrates by examples from abstract algebra how this com- 
bination works and argues that it enables modular reasoning. 



1 Motivation 

Modules for theorem provers are a means for organizing theories of applications. 
Generic interactive theorem provers like PVS [OSRSC98], IMPS [FGT93], and 
HOL [GM93] define their applications as object logics. Modules are used to 
maintain and structure these object logics. Being a classical software engineering 
concept for re-usability and structuring, modules are the obvious method for 
organizing formalizations of theorem provers. 

Apart from just organizing big theories, advanced modular features, like
parameterization and instantiation, make it possible to use modules to represent
(mathematical) structure logically. For example, the abstract algebraic structure of
groups is represented by a module in the following fashion (cf. [OSRSC98]).

Module Group [G: TYPE, o : G -> G -> G, inv: G -> G, e: G]

  ∀ x: G. x o e = x

  ∀ x: G. x o (inv x) = e

  ∀ x, y, z: G. x o (y o z) = (x o y) o z

The abstract character of groups is modeled in systems like PVS by using 
(generic) sorts or explicit parameters to model the contents of the group. Reason- 
ing about properties of group elements and the operation o is possible inside such 
a theory. The parameterization enables the instantiation of the group theory to 
actual groups. The abstractly derived results can thus be reused by an instanti- 
ation. This is also what we think of as modular reasoning; reasoning where the 
abstraction and structuring of modules becomes part of the proof process. 
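
As a rough analogy for parameterization and instantiation, the module above can be mimicked by a record whose fields are the module parameters and whose axioms are checked by brute force on a finite carrier. This is only a sketch under assumptions: the names `Group`, `axioms_hold` and `z_mod` are hypothetical, and a finite check is of course not a proof.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, FrozenSet

@dataclass(frozen=True)
class Group:
    """Parameters of the Group module: carrier, operation, inverse, unit."""
    carrier: FrozenSet[int]
    op: Callable[[int, int], int]
    inv: Callable[[int], int]
    e: int

    def axioms_hold(self) -> bool:
        """Check the three module axioms on every element of the carrier."""
        G = self.carrier
        right_unit = all(self.op(x, self.e) == x for x in G)
        right_inv = all(self.op(x, self.inv(x)) == self.e for x in G)
        assoc = all(self.op(x, self.op(y, z)) == self.op(self.op(x, y), z)
                    for x, y, z in product(G, repeat=3))
        return right_unit and right_inv and assoc

def z_mod(n: int) -> Group:
    """Instantiate the parameters with the integers modulo n under addition."""
    return Group(frozenset(range(n)),
                 lambda x, y: (x + y) % n,
                 lambda x: (-x) % n,
                 0)

print(z_mod(5).axioms_hold())
```

Any result derived only from the three axioms then applies to each instance for which `axioms_hold` succeeds, which is the reuse by instantiation described above.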

However, an adequate way of reasoning is not possible in this setting. For
example, we must consider the class of all groups to enable reasoning about
general properties which hold, say, only for finite groups. This class of all groups
cannot be defined here because the theory level is separate from the reasoning
level. There are more examples, like quotients of groups forming groups again; the
problem is always the same. Since modules are not first class citizens, we cannot
use the structure defined by a module in any formula. Hence, formalizations
using modules to represent mathematical structure are not adequate; we can
only reason about a restricted set of aspects of the (mathematical) world.

In rich type theories there is the concept of dependent types. Systems like 
Coq [D+93] and LEGO [LP92] implement such type theories. If the hierarchies 
of the type theory are rich enough then dependent types are first class citizens. 
Usually, type theories do not have advanced module concepts as they are known 
in interactive theorem provers, like PVS and IMPS. However, it is well known 
that dependent types may be used to represent modules (e.g. [Mac86]). 

We verified by case studies (e.g. [KP99]) that a module system where the 
modules are first class citizens is actually necessary for an adequate representa- 
tion of (mathematical) structures in the logic of a theorem prover. Yet, it turns 
out that we sometimes need just some form of local scope and not a first class rep- 
resentation. We need locality, i.e. the possibility to declare concepts whose scope 
is limited or temporary. Locality and adequacy are separate concerns that do 
not coincide generally. We propose to use separate devices, i.e. locales [KWP99] 
and dependent types [Kam99b]. We have designed and implemented them for 
Isabelle. In this paper, we show that in combination they realize modular rea- 
soning. 

In Section 2.1 we briefly introduce the concept of locales for Isabelle. The
way we represent dependent types in Isabelle/HOL is sketched in Section 2.2. 
The introduction to these topics has been presented elsewhere and goes only as 
far as needed for the understanding of the following. In Section 3 we present 
various case studies. They illustrate the use of locales and dependent types and 
validate that the combination of these concepts enables modular reasoning. 

2 Prerequisites and Concepts 

Isabelle is a higher order logic theorem prover [Pau94]. It is generic, that is, it 
can be instantiated to form theorem provers for a wide range of logics. These 
can be made known to the prover by defining theories that contain sort and 
type declarations, constants, and related definitions and rules. The most popular 
object logics are Isabelle/HOL and Isabelle/ZF. A powerful parser supports 
intelligible syntactic abbreviations for user-defined constants. 

Definitions, rules, and other declarations that are contained in an Isabelle 
theory are visible whenever that theory is loaded into an Isabelle session. All 
theories on which the current theory is built are also visible. All entities contained 
in a current theory stay visible for any other theory that uses the current one. 
Thus, theory rules and definitions are not suited for formalizing concepts that 
are of only local significance in certain contexts or proofs. 

Isabelle theories form hierarchies. However, theories do not have any param-
eters or other advanced features typical for modules in theorem provers. That
is, Isabelle did not have a module concept prior to the developments presented
in the current section.

2.1 Locales

Locales [KWP99] declare a context of fixed variables, local assumptions and local 
definitions. Inside this context, theorems can be proved that may depend on the 
assumptions and definitions and the fixed variables are treated like constants. 
The result will then depend on the locale assumptions, while the definitions of 
a locale are eliminated. 

The definition of a locale is static, i.e. it resides in a theory. Nevertheless, 
there is a dynamic aspect of locales corresponding to the interactive side of 
Isabelle. Locales are by default inactive. If the current theory context of an 
Isabelle session contains a theory that entails locales, they can be invoked. The 
list of currently active locales is called scope. The process of activating them is
called opening; the reverse is closing.

Locales can be defined in a nested style, i.e. a new locale can be defined as 
the extension of an existing one. Locales realize a form of polymorphism with 
binding of type variables not normally possible in Isabelle (see Section 3.2). 

Theorems proved in the scope of a locale may be exported to the surrounding 
context. The exporting device for locales dissolves the contextual structure of 
a locale. Locale definitions become expanded, locale assumptions attached as 
individual assumptions, and locale constants transformed into variables that 
may be instantiated freely. That is, exporting reflects a locale to Isabelle’s meta- 
logic. Although they do not have a first class representation, locales have at least 
a meta-logical explanation. In Section 3.4 we will see that this is crucial for the 
sound combination of locales and dependent types. 

Locales are part of the official distribution since Isabelle version 98-1. They 
can be used in all of Isabelle’s object logics — not just Isabelle/HOL — and have 
been used already in many applications apart from the ones presented here. 



2.2 Dependent Types as First Class Modules 

In rich type theories, e.g. UTT [Bai98], groups can be represented as

$$\Sigma\, G : \mathit{set}.\ \Sigma\, e : G.\ \Sigma\, \circ : \mathit{map}_2\ G\ G\ G.\ \Sigma\, {}^{-1} : \mathit{map}\ G\ G.\ \mathit{group\_axioms}$$

where group_axioms abbreviates the usual rules for groups, corresponding to the
body of a module for groups. The elements $G$, $e$, $\circ$ and ${}^{-1}$ correspond to
the parameters of a module and occur in group_axioms. Since this $\Sigma$-type can be
considered as a term in a higher type universe, we can use it in other formulas.
Hence, this modular formalization of groups is adequate.

Naively, a $\Sigma$-type may be understood as the Cartesian product $A \times B$ and a
$\Pi$-type as a function type $A \to B$, but with the $B$ having a “slot” of type $A$,
i.e. being parameterized over an element of $A$. The latter part of this intuition
motivates the use of these type constructors to model the parameterization, and
hence the abstraction, of modules.

Isabelle’s higher order logic does not have dependent types. For the first
class representation of abstract algebraic structures we construct an embedding
of $\Sigma$-types and $\Pi$-types as typed sets into Isabelle/HOL using set-theoretic
definitions [Kam99b]. Since sets in Isabelle/HOL are constants, algebraic structures
become first class citizens. Moreover, abstract structures that use other
structures as parameters may be modeled as well. We call such structures higher
order structures. An example is the structure of group homomorphisms, i.e. maps
from the carrier of a group G to the carrier of a group H that respect operations.

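$$\mathit{Hom} = \Sigma\, G \in \mathit{Group}.\ \Sigma\, H \in \mathit{Group}.\ \{\Phi \mid \Phi \in G.\langle cr\rangle \to H.\langle cr\rangle \;\wedge\; (\forall x, y \in G.\langle cr\rangle.\ \Phi(G.\langle f\rangle\, x\, y) = H.\langle f\rangle\, \Phi(x)\, \Phi(y))\}$$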

The postfix tags, like $.\langle f\rangle$, are field descriptors of the components of a
structure. In general, $\Sigma$ is used as a constructor for higher order structures. In
some cases, however, a higher order structure is uniquely constructed, as for
example the factorization of a group by one of its subgroups (see Section 3.3). In
those cases, we use the $\Pi$-type. We define a set-typed $\lambda$-notation that enables
the construction of functions of a $\Pi$-type.
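
To make the set of homomorphisms concrete, the following sketch enumerates, for one fixed pair of small groups, all maps between the carriers that respect the operation. It covers only one component of the higher order structure (a fixed G and H), uses brute force, and the function name `homs` is a hypothetical choice for illustration.

```python
from itertools import product

def homs(G, op_g, H, op_h):
    """All maps phi from G to H with phi(x * y) = phi(x) * phi(y), as dicts."""
    G, H = sorted(G), sorted(H)
    result = []
    for images in product(H, repeat=len(G)):
        phi = dict(zip(G, images))
        if all(phi[op_g(x, y)] == op_h(phi[x], phi[y]) for x in G for y in G):
            result.append(phi)
    return result

def add4(x, y):
    return (x + y) % 4

def add2(x, y):
    return (x + y) % 2

# Homomorphisms from the integers mod 4 to the integers mod 2.
hs = homs(range(4), add4, range(2), add2)
for phi in hs:
    print(phi)
```

Because the result is an ordinary set-like value, it can itself appear in further formulas, which is the first class property the text relies on.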



3 Locales + Dependent Types = Modules

The main idea of this work is that a combination of the concepts of locales 
and dependent types enables adequate representation and convenient proof with 
modular structures. To validate this hypothesis we present various case studies 
pointing out the improvements that are gained through the combination of the 
two concepts. Some basic formalizations in Section 3.1 explain the use and
interaction of the two concepts. In Section 3.2, we reconsider the case study of
Sylow’s theorem [KP99] that mainly illustrates the necessity of locales. Then, in
Section 3.3, we discuss in detail the quotient of a group that clearly proves the
need for the first class property of the dependent type representation. However,
it illustrates as well how the additional use of locales enhances the reasoning.
After summarizing other examples and analyzing the improvements we show in 
Section 3.4 how operations on structures may be performed with the combined 
use of locales and dependent types. 

3.1 Formalization of Group Theory 

Groups and Subgroups. The class of groups is defined as a set over a record
type with four elements: the carrier, the binary operation, the inverse and the
unit element that constitute a group. They can be referred to using the
projections G.<cr>, G.<f>, G.<inv>, and G.<e> for some group G. Since the class of
all groups is a set, it is a first class citizen. We can write G ∈ Group as a logical
formula to express “G is a group”. Hence, all group properties can be derived
from the definition.

In the definition of the subgroup property we can use an elegant approach
which reads informally: a subset H of G is a subgroup if it is a group with G’s
operations. Only since groups are first class citizens, we can describe subgroups
in that way. A $\Sigma$-structure is used to model the subgroup relation. The (| |)
enclosed quadruple constructs the subgroup as an element of the record type
of the elements of Group. Our $\lambda$-notation enables the restriction of the group
operations to the subset H.



  Σ G ∈ Group. {H | H ⊆ (G.<cr>) ∧
      (| carrier = H, bin_op = λ x ∈ H. λ y ∈ H. (G.<f>) x y,
         inverse = λ x ∈ H. (G.<inv>) x, unit = (G.<e>) |) ∈ Group}



The convenient syntax H <<= G, for “H is a subgroup of G”, may be used to
abbreviate (G, H) ∈ subgroup.

In addition to the first class representation of groups and subgroups, we define
a locale group to provide a local proof context for group related proofs.
The fixes, assumes, and defines parts introduce the constants with their
polymorphic types, the assumptions and definitions of the locale^1. The 'a is
a polymorphic type variable (see Section 3.2).



locale group =
  fixes
    G     :: "'a grouptype"
    e     :: "'a"
    binop :: "'a => 'a => 'a"    (infixr "#" 80)
    inv   :: "'a => 'a"          ("i (_)" [90]91)
  assumes
    Group_G   "G ∈ Group"
  defines
    e_def     "e == (G.<e>)"
    binop_def "x # y == (G.<f>) x y"
    inv_def   "i x == (G.<inv>) x"



This locale is attached to the theory file for groups. Prior to starting the proofs
concerning groups, we open this locale and can subsequently use the syntax and
the local assumption G ∈ Group throughout all proofs for groups. This improves
the readability of the derivations and reduces the length of the proofs.
For example, instead of

  [| G ∈ Group; x ∈ (G.<cr>); (G.<f>) x x = x |] ==> x = (G.<e>)

we can state this theorem now as

  [| x ∈ (G.<cr>); x # x = x |] ==> x = e

Subgoals of the form G ∈ Group that would normally be created in proofs are
not there any more because they are now matched by the corresponding locale
rule. All group related proofs share this assumption. Thus, the use of a locale
rule reduces the length of the proofs.

^1 We omitted an abbreviation for G.<cr> to distinguish it from the group G.

Cosets. To enable the proof of Sylow’s theorem and further results from group
theory we define left and right cosets of a group, a product, and an inverse
operation for subsets of groups. We create a separate theory for cosets named
Coset containing their definitions.

  r_coset G H a = (λ x. (G.<f>) x a) `` H
  l_coset G a H = (λ x. (G.<f>) a x) `` H
  set_r_cos G H = r_coset G H `` (G.<cr>)
  set_inv G H   = (λ x. (G.<inv>) x) `` H

Cosets immediately give rise to the definition of a special class of subgroups, the 
so-called normal subgroups of a group. 

  Normal = Σ G ∈ Group.
      {H | H <<= G ∧ (∀ x ∈ (G.<cr>). r_coset G H x = l_coset G x H)}

We define the convenient syntax H <| G for (G, H) ∈ Normal. As is apparent
from the definition, normal subgroups are a special case of subgroups of a group
where left and right cosets coincide. This is not necessarily the case in non-
Abelian groups.
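
The coset definitions and the coincidence condition for normal subgroups can be tried out on the symmetric group on three points. The sketch below is an illustration under assumptions: permutations are encoded as tuples, the helper names `mul` and `is_normal` are hypothetical, and `is_normal` checks only the coset condition, taking for granted that its argument is already a subgroup.

```python
from itertools import permutations

S3 = list(permutations(range(3)))        # carrier: all permutations of {0, 1, 2}

def mul(p, q):
    """Group operation: composition, applying q first and then p."""
    return tuple(p[q[i]] for i in range(3))

def r_coset(H, a):
    """Image of H under x -> x * a."""
    return frozenset(mul(h, a) for h in H)

def l_coset(a, H):
    """Image of H under x -> a * x."""
    return frozenset(mul(a, h) for h in H)

def is_normal(H):
    """Left and right cosets coincide for every element of the carrier."""
    return all(r_coset(H, x) == l_coset(x, H) for x in S3)

A3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}   # the three rotations
K = {(0, 1, 2), (1, 0, 2)}               # identity and one transposition

print(is_normal(A3), is_normal(K))       # True False
```

The second subgroup shows why the coincidence of left and right cosets is a genuine restriction in a non-Abelian group.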

Since the notion of cosets, e.g. r_coset G H a, depends on the binary
operation of the group, they have the additional parameter G. The mathematical
notation is Ha. We want to have at least a notation like H #> a.

Locales give us this support. We define a locale for the use of cosets to enable
convenient syntax for cosets and products. This locale is defined as an extension
of the locale for groups. In the scope of the locale coset, we can omit the group
parameter G that is necessary for an adequate formalization and we can define
local infix syntax. That is, we can write H #> a instead of r_coset G H a, I (H)
instead of set_inv G H, H1 <#> H2 instead of set_prod G H1 H2, and {* H *}
for set_r_cos G H. Logically, the short forms refer to the adequate definitions
as may be revealed in theorems by export (see Section 2.1).

The theorems we derive about cosets and the set product of groups are needed 
as a calculational basis for Lagrange’s theorem used in Sylow’s proof (see Section 
4.1) and in the theorems involving the quotient of a group (see Section 3.3). The 
binary operation of groups is lifted to the level of subsets of a group. We derive 
algebraic rules relating the coset operators <# and #> with the product operation 
for subsets <#>. For example, the theorem 

set_prod G (r_coset G H x) (r_coset G H y) = r_coset G H ((G.<f>) x y) 

can be written in the scope of the locale coset as 

(H #> x) <#> (H #> y) = H #> (x # y) 
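
The rule can be checked by brute force on a small example. The sketch below assumes that H is a normal subgroup, which is the standard side condition for this identity and is not spelled out in the text above; the encoding of permutations and the name `product_rule_holds` are hypothetical.

```python
from itertools import permutations

S3 = list(permutations(range(3)))

def mul(p, q):
    """Composition of permutations, applying q first and then p."""
    return tuple(p[q[i]] for i in range(3))

def r_coset(H, a):                       # H #> a
    return frozenset(mul(h, a) for h in H)

def set_prod(A, B):                      # A <#> B
    return frozenset(mul(a, b) for a in A for b in B)

def product_rule_holds(H):
    """(H #> x) <#> (H #> y) = H #> (x # y) for all x, y in the carrier."""
    return all(set_prod(r_coset(H, x), r_coset(H, y)) == r_coset(H, mul(x, y))
               for x in S3 for y in S3)

A3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}   # normal in S3
K = {(0, 1, 2), (1, 0, 2)}               # a subgroup that is not normal

print(product_rule_holds(A3), product_rule_holds(K))   # True False
```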

The advantage is considerable, especially if we consider that the syntax is not
only important when we type in a goal for the first time, but we are confronted
with it in each proof step. Hence, the syntactical improvements are crucial for a
good interaction with the proof assistant.

3.2 Sylow’s Theorem

Sylow’s theorem gives criteria for the existence of subgroups of prime power 
order in finite groups. 

Theorem 1. If $G$ is a group, $p$ a prime and $p^\alpha$ divides the order of $G$ then
$G$ contains a subgroup of order $p^\alpha$.
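
The statement can be sanity-checked on a small finite group by exhaustive search. The sketch below does this for the integers modulo 12; it is not related to the mechanized proof, and the helper names `subgroups_of_order` and `prime_powers_dividing` are hypothetical. It relies on the fact that a nonempty subset of a finite group that is closed under the operation is a subgroup.

```python
from itertools import combinations

def subgroups_of_order(elems, op, k):
    """Subsets of size k closed under op; in a finite group these are subgroups."""
    return [set(S) for S in combinations(elems, k)
            if all(op(x, y) in S for x in S for y in S)]

def prime_powers_dividing(n):
    """All p^a with p prime and a >= 1 such that p^a divides n."""
    out = []
    for p in range(2, n + 1):
        if all(p % d for d in range(2, p)):      # p is prime
            q = p
            while n % q == 0:
                out.append(q)
                q *= p
    return out

def add12(x, y):
    return (x + y) % 12

for k in prime_powers_dividing(12):
    print(k, subgroups_of_order(range(12), add12, k))
```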

In the first mechanization of the theorem [KP99], here referred to as the ad hoc 
version, we were forced to abuse the theory mechanism to achieve readable syntax 
for the main proof. We declared the local constants and definitions as Isabelle 
constants and definitions. To model local rules, we used axioms, i.e. Isabelle 
rules. This works, but contradicts the meaning of axioms and definitions in a
theory (cf. Section 2).

Locales offer the ideal support for this procedure and the mechanization is
methodically sound. In the theory of cosets, we define a locale for the proof of
Sylow’s theorem. The natural number constants we had to define in [KP99] as
constants of an Isabelle theory now become locale constants. The names we use
as abbreviations for larger formulas like the set
$\mathcal{M} = \{S \subseteq G_{cr} \mid \mathit{card}(S) = p^\alpha\}$ are also added as locale
constants. So, the fixes section of the locale sylow is



locale sylow = coset +
  fixes
    p, a, m :: "nat"
    calM    :: "'a set set"
    RelM    :: "('a set * 'a set)set"

The following defines section introduces the local definitions of the set
$\mathcal{M}$ and the relation $\sim$ on $\mathcal{M}$ (here calM and RelM).

  defines
    calM_def "calM == {s | s ⊆ (G.<cr>) ∧ card(s) = (p ^ a)}"
    RelM_def "RelM == {(N1,N2) | (N1,N2) ∈ calM × calM
                       ∧ (∃ g ∈ (G.<cr>). N1 = (N2 #> g))}"

Note that the previous definitions depend on the locale constants p, a, and 
m (and G from locale group). We can abbreviate in a convenient way using 
locale constants without being forced to parameterize the definitions, i.e. without 
locales we would have to write calM G p a m and RelM G p a m. Furthermore, 
without locales the definitions of calM and RelM would have to be theory level 
definitions — visible everywhere — whereas now they are just local. 

Finally, we add the locale assumptions to the locale sylow. Here, we can 
state all assumptions that are local for the 52 theorems of the Sylow proof. In
the mechanization of the proof without locales in [KP99] all these merely local 
assumptions had to become rules of the theory for Sylow. 

  assumes
    Group_G  "G ∈ Group"
    prime_p  "p ∈ prime"
    card_G   "order(G) = (p ^ a) * m"
    finite_G "finite (G.<cr>)"

The locale sylow can subsequently be opened to provide just the right context
to conduct the proof of Sylow’s theorem in the way we discovered in the ad hoc 
approach [KP99] to be appropriate, but now we can define this context soundly. 
In the earlier mechanization of the theorem, we abused the theory mechanisms 
of constants, rules and definitions to that end. Apart from the fact that it is 
meaningless to define local entities globally, we could not use a polymorphic 
type ’a for the base type of groups. Using polymorphism would have led to 
inconsistencies, as we would have assumed the Sylow premises for all groups. 
Hence, we had to use a fixed type, whereby the theorem was not generally 
applicable to groups. Also, we restricted the definition of local proof contexts to 
the scope as outlined above, i.e. to entities that are used in the theorem. Having 
locales, we can extend the encapsulation of local proof context much further than 
in the ad hoc mechanization and closer to the way the paper proof operates. 



More Encapsulation. Now, we may soundly use the locale mechanism for any 
merely locally relevant definition. In particular we can define the abbreviation 

  H == {g | g ∈ (G.<cr>) ∧ M1 #> g = M1}

for the main object of concern, the Sylow subgroup that is constructed in the 
proof. Naturally, we refrained from using a definition for this set before because 
in the global theorem it is not visible at all, i.e. it is a temporary definition. 
But, by adding the above line to a new locale, after introducing a suitably typed 
locale constant in the fixes part, the proofs for Sylow’s theorem improve a lot. 
A further measure taken now is to define in the new locale the two assumptions 
that are visible in most of the 52 theorems of the proof of Sylow’s theorem. 
Summarizing, a locale for the central part of Sylow’s proof is given by: 

locale sylow_central = sylow +
  fixes
    H  :: "'a set"
    M  :: "'a set set"
    M1 :: "'a set"
  assumes
    M_ass  "M ∈ calM / RelM ∧
            ¬ (p ^ ((max-n r. p ^ r | m)+1) | card(M))"
    M1_ass "M1 ∈ M"
  defines
    H_def  "H == {g | g ∈ (G.<cr>) ∧ M1 #> g = M1}"
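
The locale constants can be instantiated on a concrete group to see how the pieces fit together. The sketch below takes the symmetric group on three points with p = 2, a = 1, m = 3, builds calM and the classes of RelM, picks a class M satisfying the divisibility assumption, and computes H. The encoding and all helper names (`mul`, `rel_class`, `stabilizer`) are hypothetical, and the search is plain brute force rather than the argument of the mechanized proof.

```python
from itertools import combinations, permutations

G = list(permutations(range(3)))          # carrier, order 6 = 2^1 * 3
p, a, m = 2, 1, 3

def mul(x, y):
    """Composition of permutations, applying y first and then x."""
    return tuple(x[y[i]] for i in range(3))

def r_coset(S, g):                        # S #> g
    return frozenset(mul(s, g) for s in S)

# calM: all subsets of the carrier with p^a elements
calM = [frozenset(s) for s in combinations(G, p ** a)]

def rel_class(N):
    """Class of N in calM / RelM: everything of the form N #> g."""
    return frozenset(r_coset(N, g) for g in G)

def stabilizer(M1):
    """H == {g | g in carrier and M1 #> g = M1}."""
    return {g for g in G if r_coset(M1, g) == M1}

# k is the largest r with p^r dividing m; M is a class whose cardinality
# is not divisible by p^(k+1), as required by M_ass.
k = max(r for r in range(m + 1) if m % p ** r == 0)
M = next(c for c in map(rel_class, calM) if len(c) % p ** (k + 1) != 0)
M1 = next(iter(M))
H = stabilizer(M1)

print(len(M), len(H))
```

On this instance H has exactly p ^ a elements and is closed under the group operation, which is the property the central part of the proof establishes in general.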

We open this locale after the first few lemmas when we arrive at theorems that 
use the locale assumptions and definitions. Subsequently, we assume that the 
locales group and coset are open. Henceforth, the conjectures become shorter 
and more readable than in the ad hoc version. For example, 

  [| M ∈ calM / RelM
     ∧ ¬(p ^ ((max-n r. p ^ r | m)+1) | card(M));
     M1 ∈ M; x ∈ {g | g ∈ (G.<cr>) ∧ M1 #> g = M1};
     xa ∈ {g. g ∈ (G.<cr>) ∧ M1 #> g = M1} |]
  ==> x # xa ∈ {g | g ∈ (G.<cr>) ∧ M1 #> g = M1}

can now be stated as

  [| x ∈ H; xa ∈ H |] ==> x # xa ∈ H

Figure 1 illustrates how the scoping for Sylow’s proof works. Apart from the
main theorem, only two other theorems need to be exported from the innermost
locale sylow_central. These two theorems prove existence of witnesses for the
locale assumptions M_ass and M1_ass and are used to cancel the latter assump-
tions from the main theorem at the level of locale sylow. When we finally export
the main theorem from the context of locale sylow using the generally normal-
izing function export, we achieve the desired form of Sylow’s theorem which
is independent from any local definitions and assumptions. The theorem stands
alone as a global theorem of the Isabelle theory Coset, and is hence applicable to
any group. This becomes visible in the resulting theorem by the question marks
indicating schematic variables.

  theory Coset
    locale group
      locale coset
        locale sylow
          locale sylow_central
            ∃ M1. M1 ∈ M
            ∃ M. M ∈ calM / RelM ∧
                 ¬(p ^ ((max-n r. p ^ r | m)+1) | card(M))
            H <<= G ∧ card(H) = p ^ a
          Export
          ∃ H. H <<= G ∧ card(H) = p ^ a
    [| ?p ∈ prime; finite (?G.<cr>); ?G ∈ Group;
       order(?G) = (?p ^ ?a) * ?m |] ==> ∃ H. H <<= ?G ∧ card(H) = ?p ^ ?a

Fig. 1. Sylow’s theorem at different levels of nested locales

There is another feature of locales that we used and that is particularly 
decisive for the combination of locales and dependent types: a slightly changed
polymorphism.



Adapted Polymorphism. The declarations of locale constants may use poly-
morphism, as seen in most of the examples so far, but this differs from the
polymorphism usual in Isabelle. Usually, Isabelle’s polymorphic declarations are
completely independent of each other, e.g. if the same type variable 'a is used in
two declarations, these constants may still be instantiated to different types. In
locales, we enrich the expressiveness of polymorphic definitions by extending the
scope of the polymorphic variable names over all constant declarations of a locale.
This changes the usual polymorphism of Isabelle, in that equal names imply the same
variable. That is, polymorphic variables are fixed by the variable names, e.g. 'a,
inside the locale. Although the locale as an entity can still be instantiated to
arbitrary constant types of appropriate sort, the instantiation is implicitly forced
to be the same for all constants of a locale that use the same variable name.

This restriction only holds if the same names are used. Naturally we preserve 
the same freedom of expressiveness that was there before: if we use different 
variable names in polymorphic declarations of locale constants, they can be 
instantiated independently. An example is the use of two different groups for 
the construction of the direct product of groups (see Section 2). However, in the 
Sylow case study — and in abstract algebraic applications in general — this is 
exactly what we need: we want to constrain different constructors to the same 
type, while we still want to stay abstract, i.e. use polymorphic declarations. 
Most of the locales defined for the examples of this paper, e.g. groups, use
one polymorphic type variable 'a in different locale constant declarations, while
referring to constituents of one structure. That is, they are abstract, but of the
same type. This “connected” form of polymorphic declaration reflects the connection
that is there in the dependent type corresponding to the locale. For example,
the fact that the constituents of a Group element range over the same
base type needs to be reflected in the polymorphic type 'a in the constant
declarations of the locale group.

3.3 Quotient of a Group 

If a group is factorized by one of its normal subgroups then the quotient together 
with the induced operations on the cosets is again a group. This is a quite 
standard result of group theory, but it is challenging because it contains a self- 
reference: a structure constructed from a group shall be a group again. The 
quotient of a group illustrates the need for structures as dependent types, and 
hence first class citizens. In addition, we will analyze to what extent locales can 
be helpful. 

For this proof we define a new theory that builds on the theory of cosets.
The factorization of a group by one of its normal subgroups is given by the set of
cosets. The operations on the cosets are described by the group operations lifted
to the level of cosets, i.e. the binary operation is given by the product of cosets,
the inverse operation is given by the inverse coset, and the factor of the quotient,
a normal subgroup H, serves as the unit element. To describe this construction
formally, we use our typed $\lambda$-notation (see Section 2.2).

  FactGroup =
    λ G ∈ Group. λ H ∈ Normal ↓ G.
      (| carrier = set_r_cos G H,
         bin_op  = λ X ∈ set_r_cos G H.
                   λ Y ∈ set_r_cos G H. set_prod G X Y,
         inverse = λ X ∈ set_r_cos G H. set_inv G X,
         unit    = H |)



We define the theory syntax G Mod H for the quotient FactGroup G H. To
enhance the readability of the construction and thereby the proofs about it, we
employ locales. We cannot use any nicer syntax in the above definition of the
quotient because in the body of the $\lambda$-term above, the terms G and H are
parameters. Hence, they have to stay flexible. However, using locales we can fix a
group G and a normal subgroup H in G for the local proof context.

locale factgroup = coset +
  fixes
    F :: "('a set) grouptype"
    H :: "('a set)"
  assumes
    H_ass "H <| G"
  defines
    F_def "F == FactGroup G H"

By defining this locale as an extension of the locale coset, we incorporate all
the syntactical abbreviations we defined for cosets and operations on cosets in
Section 1. In addition, we have the group G already as a fixed local constant.
The additional definition of the quotient as F lets us derive in the scope of this
locale^2

  F = (| carrier = {* H *},
         bin_op  = (λ X ∈ {* H *}. λ Y ∈ {* H *}. X <#> Y),
         inverse = (λ X ∈ {* H *}. I(X)),
         unit    = H |)



The derivation is an application of Isabelle’s simplifier to the corresponding 
definitions, and the reduction rules for $\lambda$. By the additional use of the locale
properties of fixing and local definition, we achieve a readable syntax in a local 
scope. 

With these preparations, we can prove that this quotient is again a group,
which is stated simply as F ∈ Group in the scope of the locale. The proof
is straightforward. By backward resolution with GroupI, the introduction rule for the
group property, we reduce it to six subgoals that can be solved by
repeatedly applying previously derived results about cosets and the operations on
them. Note that the initial application of GroupI illustrates the advantage
we gain through the normalization performed by export. Although we proved
the rule GroupI for the fixed group G, we can now apply it to the group F,
which is itself constructed from that same G.

By exporting the result that F is a group we get the general formula

  [| ?G ∈ Group; ?H <| ?G |] ==> ?G Mod ?H ∈ Group

* Opening factgroup automatically opens coset and group.







Florian Kammüller



In an earlier version of this experiment, we did not employ locales. Even without
locales, stating the conjecture was not a problem; it corresponded
to the above formula. But in the proof of the group property, where all the
definitions of the lifted operations have to be employed, we were formerly exposed
to formulas that were hard to read.

As a further illustration of the concept of higher order structures, we consider
the proof that the constructed λ-term FactGroup is an element of a suitable Π-set.

  FactGroup ∈ (Π G ∈ Group. (Normal f G) → Group)

This membership statement is equivalent to the structural proposition that the
quotient of a group is a function mapping a group and a normal subgroup of this
group to another group. We call this kind of theorem a structural proposition
because membership in a structure (a set) entails the proposition we just proved,
i.e. that the quotient is a group. If we interpret the sets that are our structures as
types, then we see how the Curry-Howard isomorphism of propositions-as-types
[How80] is embodied in a statement like the one above. In contrast to type theory, we
do not need to state this isomorphism as a paradigm; it is inherent because
we use sets: from the above we can derive the logical proposition.



Further Examples and Analysis. Other theorems we mechanized [Kam99a] 
but cannot present here due to space limitations are: 

— the direct product of two groups is again a group 

— the set of bijections with the appropriate operations of composition of bijec- 
tions, inverse bijection, and identical bijection forms a group 

— the automorphisms of a ring form a group 

— the full version of Tarski’s fixed point theorem, i.e. the classical theorem with 
the addition that the set of all fixed points of the continuous function f is
itself a complete lattice. 

As with the quotient of groups, we first performed the mechanization without
the use of locales*. In comparison, we could reduce the size of the proofs by 50%
using locales. Although in the latter version some savings are due to polishing
the proofs by improving the applications of automatic simplification tactics, a
larger portion is due to locales. Furthermore, the streamlining of the proofs was
made much easier by the greatly improved comprehensibility. Where before we
were lost in huge complicated terms, and thus sometimes misled
away from the best solution, the natural representation achieved by locales now
leads the way. Locales allow us to use the same local definitions and assumptions
as in a module. At the same time the structures, like groups and rings, are
first class citizens, whereby we achieve adequacy. Through the combination with
locales the higher complexity of the adequate formalizations is balanced out.

Hence, the combination of locales and dependent types adds up to modular
reasoning. Since locales are reflected into the meta-logic (see Section 2.1), this
combination does not introduce inconsistencies and enables reuse of locales, as
we will see in the following section.

* At the time the implementation was not capable of dealing with nested locales.

3.4 Operations on Modules 

Through the embedding of structures as Σ-types and Π-types, we achieve first
class representations of modules. Thereby, we are able to use structures in formulas.
Moreover, we illustrate in the present section that we can express general
operations on structures such as forgetful functors. We show how the substructure
of a ring that is an Abelian group can be revealed. For this example we
first have to explain how we formalized rings. Using extension of record types,
we can build the base type for rings on the base type for groups, grouptype.

  record 'a ringtype = 'a grouptype +
    Rmult :: "['a, 'a] => 'a"

Thereby, we inherit the components of groups and can form rings by just extending
the latter by the second operation Rmult*. We add the syntax R.<m>
for the additional element Rmult of a ring to adapt the notation for rings to the
syntax of the group projections.

To isolate the group contained in a ring we can use an element of a Π-set.
This λ-function represents a forgetful functor: it “forgets” some structure.

  group_of :: "'a ringtype => 'a grouptype"
  "group_of == λ R ∈ Ring.
     (| carrier = (R.<cr>), bin_op = (R.<f>),
        inverse = (R.<inv>), unit = (R.<e>) |)"

Thereby, we are able to refer to the substructure of the ring that forms an
Abelian group using the forgetful functor group_of. We can derive the theorem†

  R ∈ Ring ==> group_of R ∈ AbelianGroup        [R_Abel]
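
The record extension and the forgetful functor can be mimicked outside Isabelle. The sketch below is an illustration only, assuming Python dataclasses in place of Isabelle records; the ring Z_4 and the checker `is_abelian_group` are hypothetical, and the Ring membership premise of the λ-term is not modelled.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

@dataclass(frozen=True)
class Group:
    carrier: FrozenSet[int]
    bin_op: Callable[[int, int], int]
    inverse: Callable[[int], int]
    unit: int

@dataclass(frozen=True)
class Ring(Group):
    # Record extension: a ring inherits the group fields and adds one operation.
    rmult: Callable[[int, int], int]

def group_of(r: Ring) -> Group:
    """Forgetful functor: keep the group fields, drop the multiplication."""
    return Group(r.carrier, r.bin_op, r.inverse, r.unit)

def is_abelian_group(g: Group) -> bool:
    """Brute-force check of the Abelian group axioms on a finite carrier."""
    c, op, inv, e = g.carrier, g.bin_op, g.inverse, g.unit
    return (
        all(op(a, b) in c for a in c for b in c)
        and all(op(op(a, b), x) == op(a, op(b, x))
                for a in c for b in c for x in c)
        and all(op(e, a) == a and op(inv(a), a) == e for a in c)
        and all(op(a, b) == op(b, a) for a in c for b in c)
    )

# Hypothetical example ring: the integers modulo 4.
Z4 = Ring(
    carrier=frozenset(range(4)),
    bin_op=lambda a, b: (a + b) % 4,
    inverse=lambda a: (-a) % 4,
    unit=0,
    rmult=lambda a, b: (a * b) % 4,
)

additive_group = group_of(Z4)
```

For this instance, `is_abelian_group(group_of(Z4))` is the enumerated counterpart of the theorem R_Abel.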

This enables a better structuring and decomposition of proofs. In particular, we 
can use this functor when we employ locales for ring related proofs. Then we 
want to use the encapsulation already provided for groups by the locale group. 
To achieve this we define the locale for rings as an extension. 

  locale ring = group +
    fixes
      R     :: "'a ringtype"
      rmult :: "['a, 'a] => 'a"    (infixr "**" 80)
    assumes
      Ring_R "R ∈ Ring"
    defines
      rmult_def "x ** y == (R.<m>) x y"
      R_id_G    "G == group_of R"

* Note that we formalize rings without 1. Mathematical textbooks sometimes use the
notion of rings for rings with a 1 for convenience.

† AbelianGroup is the structure of Abelian, i.e. commutative, groups. Their definition
is a simple extension of that of groups by the additional commutativity.







Note that we are able to use the locale constant G again in a locale definition, 
i.e. R_id_G. This is sound because we have not defined G yet. If one gives a 
constant an inconsistent definition, then one will be unable to instantiate results 
proved in the locale. This way of reusing the local proof context of groups for 
the superstructure of rings illustrates the flexibility of locales as well as the ease 
of integration with the mechanization of structures given by Σ and Π.

Theorems that are proved in the setup of the locale ring using group results
will have in the exported form the assumptions

  [| R ∈ Ring; group_of R ∈ Group ... |] ==> ...

But, as an implication of the theorem R_Abel, we can easily derive

  R ∈ Ring ==> group_of R ∈ Group

Thus, the second premise can be cancelled. Although we have to do a final proof
step to cancel the additional premise, this shows the advantage of locales being
reflected onto premises of the global representations of theorems: it is impossible
to introduce unsoundness. A definition of a locale constant that is not consistent
with its properties stated by locale rules would not be applicable. Since locale
assumptions and definitions are explained through meta-assumptions, the resulting
theorem would carry the inconsistent assumptions implicitly. We see that the
nested structure of locales is consistent with the logical structure because locales
are reflected to the meta-logic. Thereby reuse of the locale of groups is possible.

4 Conclusion 

4.1 Related Work 

The theorem of Lagrange has been proved with the Boyer-Moore
prover [Yu90]. E. Gunter formalized group theory in HOL [Gun89]. In the higher
order logic theorem prover IMPS [FGT93] some portion of abstract algebra, including
Lagrange’s theorem, is proved. Mizar’s [Try93] library of formalized mathematics
probably contains more abstract algebra theorems than any other system. However,
to our knowledge we were the first to mechanically prove Sylow’s first
theorem. Since it uses Lagrange’s theorem, we had to prove the latter first. In contrast
to the formalization in [Yu90], the form of Lagrange’s theorem that we need
for Sylow’s theorem is not just the one stating that the order of the subgroup
divides the order of the group; instead it gives the precise representation of the
group’s order as the product of the order of the subgroup and the index of this subgroup
in G. Since we have a first class representation of groups, we can express
this order equation and can use general results about finite sets to reduce it to
simpler theorems about cosets. Hence, compared to [Yu90] our proof of Lagrange’s
theorem is simpler.

Locales implement a sectioning device similar to that in AUTOMATH [dB80]
or Coq [Dow90]. In contrast to this kind of section, locales are defined statically.
Also, optional pretty printing syntax is part of the concept. The HOL system
[GM93] has a concept of abstract theories, based on Gunter’s experiments with
abstract algebra [Gun89,Gun90], that is in parts comparable to locales.

4.2 Discussion

Modules for theorem provers can be considered as a means to represent abstract
structures. In that case modules need to be first class citizens to enable adequacy.
Another aspect of modules is the locality they provide by their scoping, which
is useful, if not always necessary, for shortening formulas and proofs and hence
for their readability.

The embedding of dependent types combines the expressiveness of type theories
with the convenience and power of higher order logic theorem proving.
Although the dependent types are only modeled as typed sets of Isabelle/HOL,
we get the “expressive advantage”. In contrast to earlier mechanizations of dependent
types in higher order logic [JM93], our embedding is relatively lightweight
as it is based on a simple set-theoretic embedding. At the same time the
Π- and Σ-types are strong enough to express higher-level modular notions, like
mappings between parameterized structures.

Locales are a general concept for locality. They are not restricted to any 
particular object logic of Isabelle, i.e. they can be used for any kind of reason- 
ing. Although they are not first class citizens, there is the export device that 
reflects locales to meta-logical assumptions, thereby explaining them in terms of 
Isabelle’s meta-logic. 

Locality and adequacy are separate aspects that may sometimes coincide,
but in general they should be treated individually. We have illustrated this by
showing how the devices of locales for locality and dependent types for adequacy
add up to support modular reasoning. The presented case studies contain both
aspects that have to be expressed adequately by first class modules given by
dependent types and aspects that need the structuring and syntactic support of
locales. Moreover, we have shown that the separation of the concepts does not
hinder their smooth combination. Where the first class representations become
too complicated, locales can be used to reduce them. In intricate
combinations like the forgetful functor in Section 3.4 we have seen that the
reflection of locales to the meta-logic preserves consistency and enables reuse.

Hence, instead of using one powerful but inadequate concept for modular 
reasoning, like classical modules, we think that locales combined with dependent 
types are appropriate. The separation is tailored for Isabelle, yet it is applicable 
to other theorem provers. Since the difference between adequacy and locality is 
a general theoretical issue, the conceptual design of a combination of two devices 
for the support of modular reasoning presented in this paper is of more general 
interest.

Abstract. The little theories method, in which mathematical reasoning
is distributed across a network of theories, is a powerful technique
for describing and analyzing complex systems. This paper presents an
infrastructure for intertheory reasoning that can support applications
of the little theories method. The infrastructure includes machinery to
store theories and theory interpretations, to store known theorems of a
theory with the theory, and to make definitions in a theory by extending
the theory “in place”. The infrastructure is an extension of the intertheory
infrastructure employed in the IMPS Interactive Mathematical Proof
System.



1 Introduction 

Mathematical reasoning is always performed within some context, which includes 
vocabulary and notation for expressing concepts and assertions, and axioms and 
inference rules for proving conjectures. In informal mathematical reasoning, the 
context is almost entirely implicit. In fact, substantial mathematical training is 
often needed to “see” the context. 

The situation is quite different in formal mathematics performed in logical 
systems often with the aid of computers. The context is formalized as a math- 
ematical structure. The favored mathematical structure for this purpose is an 
axiomatic theory within a formal logic. An axiomatic theory, or theory for short, 
consists of a formal language and a set of axioms expressed in the language. It 
is a specification of a set of objects: the language provides names for the objects 
and the axioms constrain what properties the objects have. 

Sophisticated mathematical reasoning usually involves several related but 
different mathematical contexts. There are two main ways of dealing with a 
multitude of contexts using theories. The big theory method is to choose a highly 
expressive theory — often based on set theory or type theory — that can represent 
many different contexts. Each context that arises is represented in the theory or 
in an extension of the theory. Contexts are related to each other in the theory 
itself. 

An alternate approach is the little theories method in which separate contexts
are represented by separate theories. Structural relationships between contexts
are represented as interpretations between theories (see [4,19]). Interpretations
serve as conduits for passing information (e.g., definitions and theorems) from
abstract theories to more concrete theories, or indeed to other equally abstract
theories. As a result, the big theory is replaced with a network of theories,
which can include both small compact theories and large powerful theories. The
little theories approach has been used in both mathematics and computer science
(see [10] for references). In [10] we argue that the little theories method offers
important advantages for mechanized mathematics. Many of these advantages
have been demonstrated by the IMPS Interactive Mathematical Proof System
[9,11], which supports the little theories method.

A mechanized mathematics system based on the little theories method requires
a different infrastructure than one based on the big theory method. In the
big theory method all reasoning is performed within a single theory, while in the
little theories method there is both intertheory and intratheory reasoning. This
paper presents an infrastructure for intertheory reasoning that can be employed
in several kinds of mechanized mathematics systems including theorem provers,
software specification and verification systems, computer algebra systems, and
electronic mathematics libraries. The infrastructure is closely related to the
intertheory infrastructure used in IMPS, but it includes some capabilities that are
not provided by the IMPS intertheory infrastructure.

The little theories method is a major element in the design of several software
specification systems including EHDM [18], IOTA [16], KIDS [20], OBJ3 [12], and
Specware [21]. The intertheory infrastructures of these systems are mainly for
constructing theories and linking them together into a network. They do not support
the rich interplay of making definitions, proving theorems, and “transporting”
definitions and theorems from one theory to another needed for developing
and exploring theories within a network.

The Ergo [17] theorem proving system is another system besides
IMPS that directly supports the little theories method.¹ Its infrastructure
for intertheory reasoning provides full support for constructing theories from
other theories via inclusion and interpretation but only partial support for developing
theories by making definitions and proving theorems. In Ergo, theory
interpretation is static: theorems from the source theory of an interpretation can
be transported to the target theory of the interpretation only when the interpretation
is created [14]. Theory interpretation is dynamic in the intertheory
infrastructure of this paper (and of IMPS).

The rest of the paper is organized as follows. The underlying logic of the 
intertheory infrastructure is given in section 2. Section 3 discusses the design 
requirements for the infrastructure. The infrastructure itself is presented in sec- 
tion 4. Finally, some applications of the infrastructure are described in section 5. 



¹ Many theorem proving systems indirectly support the little theories method by
allowing a network of theories to be formalized within a big theory.

2 The Underlying Logic

The intertheory infrastructure presented in this paper assumes an underlying 
logic. Many formal systems, including first-order logic and Zermelo-Fraenkel set 
theory, could serve as the underlying logic. For the sake of convenience and 
precision, we have chosen a specific underlying logic for the infrastructure rather 
than treating the underlying logic as a parameter. Our choice is Church’s simple 
theory of types [3], denoted in this paper by C. 

The underlying logics of many theorem proving systems are based on C. For
example, the underlying logic of IMPS (and its intertheory infrastructure) is a
version of C called LUTINS [5,6,8]. Unlike C, LUTINS admits undefined terms,
partial functions, and subtypes. By virtue of its support for partial functions and
subtypes, many theory interpretations can be expressed more directly in LUTINS
than in C [8].

We now give a brief presentation of C. The missing details can be filled
in by consulting Church’s original paper [3] or one of the logic textbooks, such
as [1], which contains a full presentation of C. We will also define a number
of logical notions in the context of C, including the notions of a theory and an
interpretation.



2.1 Syntax of C 

The types of C are defined inductively as follows:

1. $\iota$ is a type (which denotes the type of individuals).

2. $*$ is a type (which denotes the type of truth values).

3. If $\alpha$ and $\beta$ are types, then $(\alpha \to \beta)$ is a type (which denotes the type of total
functions that map values of type $\alpha$ to values of type $\beta$).

Let $\mathcal{T}$ denote the set of types of C.

A tagged symbol is a symbol tagged with a member of $\mathcal{T}$. A tagged symbol
whose symbol is $a$ and whose tag is $\alpha$ is written as $a_\alpha$. Let $\mathcal{V}$ be a set of tagged
symbols called variables such that, for each $\alpha \in \mathcal{T}$, the set of members of $\mathcal{V}$
tagged with $\alpha$ is countably infinite. A constant is a tagged symbol $c_\alpha$ such that
$c_\alpha \notin \mathcal{V}$.

A language $L$ of C is a set of constants. (In the following, let a “language”
mean a “language of C”.) An expression of type $\alpha$ of $L$ is a finite sequence of
symbols defined inductively as follows:

1. Each $a_\alpha \in \mathcal{V} \cup L$ is an expression of type $\alpha$.

2. If $F$ is an expression of type $\alpha \to \beta$ and $A$ is an expression of type $\alpha$, then
$F(A)$ is an expression of type $\beta$.

3. If $x_\alpha \in \mathcal{V}$ and $E$ is an expression of type $\beta$, then $(\lambda x_\alpha \,.\, E)$ is an expression
of type $\alpha \to \beta$.

4. If $A$ and $B$ are expressions of type $\alpha$, then $(A = B)$ is an expression of
type $*$.

5. If $A$ and $B$ are expressions of type $*$, then $\neg A$, $(A \supset B)$, $(A \wedge B)$, and $(A \vee B)$
are expressions of type $*$.

6. If $x_\alpha \in \mathcal{V}$ and $E$ is an expression of type $*$, then $(\forall x_\alpha \,.\, E)$ and $(\exists x_\alpha \,.\, E)$
are expressions of type $*$.

Expressions of type $\alpha$ are denoted by $A_\alpha$, $B_\alpha$, $C_\alpha$, etc. Let $\mathcal{E}_L$ denote the set of
expressions of $L$. “Free variable”, “closed expression”, and similar notions are
defined in the obvious way. Let $\mathcal{S}_L$ denote the set of sentences of $L$, i.e., the set
of closed expressions of type $*$ of $L$.
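
The inductive definitions above translate directly into a small data type with a type-computing function. The following is a minimal Python sketch under stated assumptions: the class names (`Base`, `Fun`, `Sym`, `App`, `Lam`, `Eq`) and the function `type_of` are hypothetical, and only clauses 1 to 4 of the expression definition are covered; the connectives and quantifiers of clauses 5 and 6 are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Base:
    name: str              # "iota" (individuals) or "*" (truth values)

@dataclass(frozen=True)
class Fun:
    dom: "Type"            # function type (dom -> cod)
    cod: "Type"

Type = Union[Base, Fun]
IOTA, BOOL = Base("iota"), Base("*")

@dataclass(frozen=True)
class Sym:                 # tagged symbol: a variable or a constant
    name: str
    tag: Type

@dataclass(frozen=True)
class App:                 # application F(A)
    fun: "Expr"
    arg: "Expr"

@dataclass(frozen=True)
class Lam:                 # abstraction (lambda x . E)
    var: Sym
    body: "Expr"

@dataclass(frozen=True)
class Eq:                  # equation (A = B)
    lhs: "Expr"
    rhs: "Expr"

Expr = Union[Sym, App, Lam, Eq]

def type_of(e: Expr) -> Type:
    """Compute the type of an expression, or raise TypeError if it is ill-formed."""
    if isinstance(e, Sym):
        return e.tag
    if isinstance(e, App):
        f, a = type_of(e.fun), type_of(e.arg)
        if not (isinstance(f, Fun) and f.dom == a):
            raise TypeError("ill-typed application")
        return f.cod
    if isinstance(e, Lam):
        return Fun(e.var.tag, type_of(e.body))
    if isinstance(e, Eq):
        if type_of(e.lhs) != type_of(e.rhs):
            raise TypeError("equation between expressions of different types")
        return BOOL
    raise TypeError("not an expression")

# Example: the predicate (lambda x . x = 0) on individuals.
x, zero = Sym("x", IOTA), Sym("0", IOTA)
is_zero = Lam(x, Eq(x, zero))
```

In this sketch `type_of(is_zero)` is the function type from individuals to truth values, and applying `is_zero` to `zero` yields an expression of type `*`.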

2.2 Semantics of C 

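For each language $L$, there is a set $\mathcal{M}_L$ of models and a relation $\models$ between
models and sentences of $L$. $M \models A_*$ is read as “$M$ is a model of $A_*$”. Let $L$
be a language, $A_* \in \mathcal{S}_L$, $\Gamma \subseteq \mathcal{S}_L$, and $M \in \mathcal{M}_L$. $M$ is a model of $\Gamma$, written
$M \models \Gamma$, if $M \models B_*$ for all $B_* \in \Gamma$. $\Gamma$ logically implies $A_*$, written $\Gamma \models A_*$, if
every model of $\Gamma$ is a model of $A_*$.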

2.3 Theories 

A theory of C is a pair $T = (L, \Gamma)$ where $L$ is a language and $\Gamma \subseteq \mathcal{S}_L$. $\Gamma$ serves
as the set of axioms of $T$. (In the following, let a “theory” mean a “theory of C”.)
$A_*$ is a (semantic) theorem of $T$, written $T \models A_*$, if $\Gamma \models A_*$. $T$ is consistent if
some sentence of $L$ is not a theorem of $T$. A theory $T' = (L', \Gamma')$ is an extension
of $T$, written $T \le T'$, if $L \subseteq L'$ and $\Gamma \subseteq \Gamma'$. $T'$ is a conservative extension of $T$,
written $T \trianglelefteq T'$, if $T \le T'$ and, for all $A_* \in \mathcal{S}_L$, if $T' \models A_*$, then $T \models A_*$.

The following lemma about theory extensions is easy to prove.

Lemma 1. Let $T_1$, $T_2$, and $T_3$ be theories.

1. If $T_1 \le T_2 \le T_3$, then $T_1 \le T_3$.

2. If $T_1 \trianglelefteq T_2 \trianglelefteq T_3$, then $T_1 \trianglelefteq T_3$.

3. If $T_1 \le T_2 \le T_3$ and $T_1 \trianglelefteq T_3$, then $T_1 \trianglelefteq T_2$.

4. If $T_1 \trianglelefteq T_2$ and $T_1$ is consistent, then $T_2$ is consistent.

2.4 Interpretations 

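Let $T = (L, \Gamma)$ and $T' = (L', \Gamma')$ be theories, and let $\Phi = (\gamma, \mu, \nu)$ where $\gamma \in \mathcal{T}$
and $\mu : \mathcal{V} \to \mathcal{V}$ and $\nu : L \to \mathcal{E}_{L'}$ are total functions.

For $\alpha \in \mathcal{T}$, $\Phi(\alpha)$ is defined inductively as follows:

1. $\Phi(\iota) = \gamma$.

2. $\Phi(*) = *$.

3. If $\alpha, \beta \in \mathcal{T}$, then $\Phi(\alpha \to \beta) = \Phi(\alpha) \to \Phi(\beta)$.

$\Phi$ is a translation from $L$ to $L'$ if:

1. For all $x_\alpha \in \mathcal{V}$, $\mu(x_\alpha)$ is of type $\Phi(\alpha)$.

2. For all $c_\alpha \in L$, $\nu(c_\alpha)$ is of type $\Phi(\alpha)$.

Suppose $\Phi$ is a translation from $L$ to $L'$. For $E_\alpha \in \mathcal{E}_L$, $\Phi(E_\alpha)$ is the member
of $\mathcal{E}_{L'}$ defined inductively as follows:

1. If $E_\alpha \in \mathcal{V}$, then $\Phi(E_\alpha) = \mu(E_\alpha)$.

2. If $E_\alpha \in L$, then $\Phi(E_\alpha) = \nu(E_\alpha)$.

3. $\Phi(F_{\alpha \to \beta}(A_\alpha)) = \Phi(F_{\alpha \to \beta})(\Phi(A_\alpha))$.

4. $\Phi(\lambda x_\alpha \,.\, E_\beta) = (\lambda \Phi(x_\alpha) \,.\, \Phi(E_\beta))$.

5. $\Phi(A_\alpha = B_\alpha) = (\Phi(A_\alpha) = \Phi(B_\alpha))$.

6. $\Phi(\neg E_*) = \neg \Phi(E_*)$.

7. $\Phi(A_* \mathbin{\Box} B_*) = (\Phi(A_*) \mathbin{\Box} \Phi(B_*))$ where $\Box \in \{\supset, \wedge, \vee\}$.

8. $\Phi(\Box x_\alpha \,.\, E_*) = (\Box \Phi(x_\alpha) \,.\, \Phi(E_*))$ where $\Box \in \{\forall, \exists\}$.

$\Phi$ is an interpretation of $T$ in $T'$ if it is a translation from $L$ to $L'$ that maps
theorems to theorems, i.e., for all $A_* \in \mathcal{S}_L$, if $T \models A_*$, then $T' \models \Phi(A_*)$.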
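
The type-level part of a translation, together with the typing condition on constants, can be sketched in a few lines. This Python fragment is an illustration under stated assumptions: types are encoded as the strings `"iota"` and `"*"` and as tuples `("->", a, b)`, the names `translate_type` and `respects_constant_types` are hypothetical, and the mapping of constants is restricted to the special case where each source constant is sent to a target constant rather than to an arbitrary expression.

```python
IOTA, BOOL = "iota", "*"

def fun(a, b):
    """Function type (a -> b), encoded as a tuple."""
    return ("->", a, b)

def translate_type(gamma, t):
    """Type part of a translation: iota goes to gamma, * stays, arrows are mapped componentwise."""
    if t == IOTA:
        return gamma
    if t == BOOL:
        return BOOL
    _, a, b = t
    return fun(translate_type(gamma, a), translate_type(gamma, b))

def respects_constant_types(gamma, source, target, nu):
    """Check that every source constant of type t is sent to a target constant of the translated type.

    source and target map constant names to types; nu maps source names to
    target names (a simplifying assumption, see the lead-in).
    """
    return all(
        nu[c] in target and target[nu[c]] == translate_type(gamma, t)
        for c, t in source.items()
    )

# Hypothetical languages: a monoid signature interpreted in an additive signature.
binary = fun(IOTA, fun(IOTA, IOTA))
monoid = {"e": IOTA, "mul": binary}
additive = {"0": IOTA, "+": binary}
nu_ok = {"e": "0", "mul": "+"}
nu_bad = {"e": "+", "mul": "0"}
```

With `gamma = IOTA` the mapping `nu_ok` passes the check and `nu_bad` does not. Whether a translation is an interpretation additionally requires that theorems are mapped to theorems, which this sketch does not attempt to decide.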

Theorem 1 (Relative Consistency). Suppose $\Phi$ is an interpretation of $T$ in
$T'$ and $T'$ is consistent. Then $T$ is consistent.

Proof. Assume $\Phi = (\gamma, \mu, \nu)$ is an interpretation of $T$ in $T'$, $T'$ is consistent,
and $T$ is inconsistent. Then $F_* = (\exists x_\iota \,.\, \neg(x_\iota = x_\iota))$ is a theorem of $T$, and so
$\Phi(F_*) = (\exists \mu(x_\iota) \,.\, \neg(\mu(x_\iota) = \mu(x_\iota)))$ is a theorem of $T'$, which contradicts the
consistency of $T'$. □

The next theorem gives a sufficient condition for a translation to be an
interpretation.

Theorem 2 (Interpretation Theorem). Suppose $\Phi$ is a translation from $L$
to $L'$ and, for all $A_* \in \Gamma$, $T' \models \Phi(A_*)$. Then $\Phi$ is an interpretation of $T$ in $T'$.

Proof. The proof is similar to the proof of Theorem 12.4 in [6]. □



3 Design Requirements 

At a minimum, an infrastructure for intertheory reasoning should provide the 
capabilities to store theories and interpretations and to record theorems as they 
are discovered. We present in this section a “naive” intertheory infrastructure 
with just these capabilities. We then show that the naive infrastructure lacks 
several important capabilities. From these results we formulate the requirements 
that an intertheory infrastructure should satisfy.

3.1 A Naive Intertheory Infrastructure

We now present a naive intertheory infrastructure. In this design, the state of
the infrastructure is a set of infrastructure objects. The infrastructure state is 
initially the empty set. It is changed by the application of infrastructure opera- 
tions which add new objects to the state or modify objects already in the state. 
There are three kinds of infrastructure objects for storing theories, theorems, 
and interpretations, respectively, and there are four infrastructure operations 
for creating the three kinds of objects and for “installing” theorems in theories. 

Infrastructure objects are denoted by boldface letters. The three infrastructure
objects are defined simultaneously as follows:

1. A theory object is a tuple $\mathbf{T} = (n, L, \Gamma, \Sigma)$ where $n$ is a string, $L$ is a language,
$\Gamma \subseteq \mathcal{S}_L$, and $\Sigma$ is a set of theorem objects. $n$ is called the name of $\mathbf{T}$ and is
denoted by $[\mathbf{T}]$. $(L, \Gamma)$ is called the theory of $\mathbf{T}$ and is denoted by $\mathrm{thy}(\mathbf{T})$.

2. A theorem object is a tuple $\mathbf{A} = ([\mathbf{T}], A_*, J)$ where $\mathbf{T} = (n, L, \Gamma, \Sigma)$ is a
theory object, $A_* \in \mathcal{S}_L$, and $J$ is a justification² that $\mathrm{thy}(\mathbf{T}) \models A_*$.

3. An interpretation object is a tuple $\mathbf{I} = ([\mathbf{T}], [\mathbf{T}'], \Phi, J)$ where $\mathbf{T}$ and $\mathbf{T}'$
are theory objects, $\Phi$ is a translation, and $J$ is a justification that $\Phi$ is an
interpretation of $\mathrm{thy}(\mathbf{T})$ in $\mathrm{thy}(\mathbf{T}')$.

Let $\mathbf{S}$ denote the infrastructure state. The four infrastructure operations are
defined as follows:

1. Given a string $n$, a language $L$, and $\Gamma \subseteq \mathcal{S}_L$ as input, if, for all theory objects
$\mathbf{T}' = (n', L', \Gamma', \Sigma') \in \mathbf{S}$, $n \neq n'$ and $(L, \Gamma) \neq \mathrm{thy}(\mathbf{T}')$, then create-thy-obj
adds the theory object $(n, L, \Gamma, \emptyset)$ to $\mathbf{S}$; otherwise, the operation fails.

2. Given a theory object $\mathbf{T} \in \mathbf{S}$, a sentence $A_*$, and a justification $J$ as input,
if $\mathbf{A} = ([\mathbf{T}], A_*, J)$ is a theorem object, then create-thm-obj adds $\mathbf{A}$ to $\mathbf{S}$;
otherwise, the operation fails.

3. Given two theory objects $\mathbf{T}, \mathbf{T}' \in \mathbf{S}$, a translation $\Phi$, and a justification $J$ as
input, if $\mathbf{I} = ([\mathbf{T}], [\mathbf{T}'], \Phi, J)$ is an interpretation object, then create-int-obj
adds $\mathbf{I}$ to $\mathbf{S}$; otherwise, the operation fails.

4. Given a theorem object $\mathbf{A} = ([\mathbf{T}], A_*, J) \in \mathbf{S}$ and a theory object $\mathbf{T}' =
(n', L', \Gamma', \Sigma') \in \mathbf{S}$ as input, if $\mathrm{thy}(\mathbf{T}) \le \mathrm{thy}(\mathbf{T}')$, then install-thm-obj replaces
$\mathbf{T}'$ in $\mathbf{S}$ with the theory object $(n', L', \Gamma', \Sigma' \cup \{\mathbf{A}\})$; otherwise, the operation
fails.
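
The objects and operations of the naive infrastructure can be prototyped compactly. The following Python sketch is not the IMPS implementation; it assumes that constants and sentences are plain strings, that a justification is an opaque string which is never checked, and that interpretation objects are left out. The class and method names are hypothetical renderings of create-thy-obj, create-thm-obj, and install-thm-obj.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TheoremObj:
    theory_name: str       # [T], the name of the theory object it was proved in
    sentence: str          # the sentence, as an opaque string in this sketch
    justification: str     # not checked here; the notion is left unspecified

@dataclass(frozen=True)
class TheoryObj:
    name: str
    language: frozenset    # set of constants
    axioms: frozenset      # set of sentences
    theorems: frozenset = frozenset()

class NaiveState:
    """Infrastructure state: stored theory objects and theorem objects."""

    def __init__(self):
        self.theories = {}     # name -> TheoryObj
        self.theorems = set()  # TheoremObj instances

    def create_thy_obj(self, name, language, axioms):
        language, axioms = frozenset(language), frozenset(axioms)
        for t in self.theories.values():
            # Fail if the name or the theory itself is already stored.
            if t.name == name or (t.language, t.axioms) == (language, axioms):
                raise ValueError("create-thy-obj fails")
        self.theories[name] = TheoryObj(name, language, axioms)
        return self.theories[name]

    def create_thm_obj(self, theory_name, sentence, justification):
        if theory_name not in self.theories:
            raise ValueError("create-thm-obj fails")
        thm = TheoremObj(theory_name, sentence, justification)
        self.theorems.add(thm)
        return thm

    def install_thm_obj(self, thm, target_name):
        if thm not in self.theorems or target_name not in self.theories:
            raise ValueError("install-thm-obj fails")
        src, tgt = self.theories[thm.theory_name], self.theories[target_name]
        # The target theory must be an extension of the source theory.
        if not (src.language <= tgt.language and src.axioms <= tgt.axioms):
            raise ValueError("install-thm-obj fails")
        self.theories[target_name] = TheoryObj(
            tgt.name, tgt.language, tgt.axioms, tgt.theorems | {thm})

# Usage with two hypothetical theories.
state = NaiveState()
state.create_thy_obj("monoid", {"e", "mul"}, {"assoc", "unit"})
state.create_thy_obj("group", {"e", "mul", "inv"}, {"assoc", "unit", "inverse"})
thm = state.create_thm_obj("monoid", "the unit is unique", "proof omitted")
state.install_thm_obj(thm, "group")
```

Note that installing a theorem replaces the stored theory object rather than mutating it, which matches the wording of operation 4.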



² The notion of a justification is not specified. It could, for example, be a formal proof.

3.2 Missing Capabilities

The naive infrastructure is missing four important capabilities: 



A. Definitions. Suppose we would like to make a definition that the constant
$\mathrm{is\_zero}_{\iota \to *}$ is the predicate $(\lambda x_\iota \,.\, x_\iota = 0_\iota)$ in a theory $T$ stored in a theory object
$\mathbf{T} = (n, L, \Gamma, \Sigma) \in \mathbf{S}$. The naive infrastructure offers only one way to do this:
create the extension $T' = (L', \Gamma')$ of $T$, where $L' = L \cup \{\mathrm{is\_zero}_{\iota \to *}\}$ and

  $\Gamma' = \Gamma \cup \{\mathrm{is\_zero}_{\iota \to *} = (\lambda x_\iota \,.\, x_\iota = 0_\iota)\}$,

and then store $T'$ in a new theory object $\mathbf{T}'$ by invoking create-thy-obj. If
$\mathrm{is\_zero}_{\iota \to *}$ is not in $L$, $T$ and $T'$ can be regarded as the same theory since $T \trianglelefteq T'$
and $\mathrm{is\_zero}_{\iota \to *}$ can be “eliminated” from any expression of $L'$ by replacing every
occurrence of it with $(\lambda x_\iota \,.\, x_\iota = 0_\iota)$.
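
Eliminating a defined constant is plain substitution of the definiens for every occurrence. The following Python sketch assumes a hypothetical encoding of expressions as nested tuples whose first component is a tag such as `"app"`, `"lam"`, or `"="`, with symbols as strings; the names `eliminate` and `size` are illustrative, and the sketch assumes the constant's name differs from all tags and variable names.

```python
def eliminate(expr, const, definiens):
    """Replace every occurrence of the defined constant by its definiens."""
    if expr == const:
        return definiens
    if isinstance(expr, tuple):
        return tuple(eliminate(part, const, definiens) for part in expr)
    return expr

def size(expr):
    """Number of symbols and tags in an expression."""
    if isinstance(expr, tuple):
        return sum(size(part) for part in expr)
    return 1

# The definition is_zero = (lambda x . x = 0) in the hypothetical encoding.
IS_ZERO = "is_zero"
DEFINIENS = ("lam", "x", ("=", "x", "0"))

# The sentence is_zero(c) = is_zero(d), written with the defined constant.
sentence = ("=", ("app", IS_ZERO, "c"), ("app", IS_ZERO, "d"))
expanded = eliminate(sentence, IS_ZERO, DEFINIENS)
```

The expanded sentence no longer mentions `is_zero`, but it is larger than the original, which is the cost of unfolding definitions when translating along an extended interpretation.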

Definitions are made all the time in mathematics, and thus implementing
definitions in this way will lead to an explosion of theory objects storing theories
that are essentially the same. A better way of implementing definitions would be
to extend $T$ to $T'$ “in place” by replacing $T$ in $\mathbf{T}$ with $T'$. The resulting object
would still be a theory object because every theorem of $T$ is also a theorem of
$T'$.

This approach, however, would introduce a new problem. If an interpretation
object $\mathbf{I} = ([\mathbf{T}], [\mathbf{T}'], \Phi, J) \in \mathbf{S}$ and $\mathrm{thy}(\mathbf{T})$ is extended in place by making a
definition $c_\alpha = E_\alpha$, then $\Phi$ would no longer be an interpretation
of $T$ in $T'$ because $\Phi(c_\alpha)$ would not be defined.

There are three basic solutions to this problem. The first one is to automatically
extend $\Phi$ to an interpretation of $T$ in $T'$ by defining $\Phi(c_\alpha) = \Phi(E_\alpha)$.
However, this solution has the disadvantage that, when an expression of $T$ containing
$c_\alpha$ is translated to an expression of $T'$ via the extended $\Phi$, the expression
of $T$ will be expanded into a possibly much bigger expression of $T'$.

The second solution is to automatically transport the definition $c_\alpha = E_\alpha$
from $T$ to $T'$ via $\Phi$ by making a new definition of the form $d_\beta = \Phi(E_\alpha)$ in
$T'$ and defining $\Phi(c_\alpha) = d_\beta$. The implementation of this solution would require
care because, when two similar theories are both interpreted in a third theory,
common definitions in the source theories may be transported multiple times to
the target theory, resulting in definitions in the target theory that define different
constants in exactly the same way.

The final solution is to let the user extend $\Phi$ by hand whenever it is necessary.
This solution is more flexible than the first two solutions, but it would impose a
heavy burden on the user. Our experience in developing IMPS suggests that the
best solution would be some combination of these three basic solutions.



B. Profiles. Suppose we would like to make a “definition” that the constant
$\mathrm{a\_non\_zero}_\iota$ has a value not equal to $0_\iota$ in a theory $T$ stored in a theory object
$\mathbf{T} = (n, L, \Gamma, \Sigma) \in \mathbf{S}$. That is, we would like to add a new constant $\mathrm{a\_non\_zero}_\iota$
to $L$ whose value is specified, but not necessarily uniquely determined, by the
sentence $\neg(\mathrm{a\_non\_zero}_\iota = 0_\iota)$. More precisely, let $T' = (L', \Gamma')$ where $L' = L \cup
\{\mathrm{a\_non\_zero}_\iota\}$ and

  $\Gamma' = \Gamma \cup \{\neg(\mathrm{a\_non\_zero}_\iota = 0_\iota)\}$.

If $\mathrm{a\_non\_zero}_\iota$ is not in $L$ and the sentence $(\exists x_\iota \,.\, \neg(x_\iota = 0_\iota))$ is a theorem of $T$,
then $T \trianglelefteq T'$.

We call definitions of this kind profiles.³ A profile introduces a finite number
of new constants that satisfy a given property. Like ordinary definitions, profiles
produce conservative extensions, but unlike ordinary definitions, the constants
introduced by a profile cannot generally be eliminated. A profile can be viewed
as a generalization of a definition since any definition can be expressed as a
profile.

Profiles are very useful for introducing new machinery into a theory. For ex- 
ample, a profile can be used to introduce a collection of objects plus a set of 
operations on the objects — what is called an “algebra” in mathematics and an 
“abstract datatype” in computer science. The new machinery will not compro- 
mise the original machinery of T because the resulting extension T' of T will be 
conservative. Since T' is a conservative extension of T, any reasoning performed 
in T could just as well have been performed in T' . Thus the availability of T' 
normally makes T obsolete. 

Making profiles in the naive infrastructure leads to theory objects which store 
obsolete theories. The way of implementing definitions by extending theories in 
place would work just as well for profiles. As with definitions, extending theories 
in place could cause some interpretations to break. A combination of the second 
and third basic solutions to the problem given above for definitions could be 
used for profiles. The first basic solution is not applicable because profiles do 
not generally have the eliminability property of definitions. 



C. Theory Extensions. Suppose that $\mathbf{S}$ contains two theory objects $\mathbf{T}$ and
$\mathbf{T}'$ with $\mathrm{thy}(\mathbf{T}) \le \mathrm{thy}(\mathbf{T}')$. In most cases (but not all), one would want every
theorem object installed in $\mathbf{T}$ to also be installed in $\mathbf{T}'$. The naive infrastructure
does not have this capability. That is, there is no support for having theorem
objects installed in a theory object to automatically be installed in preselected
extensions of the theory object. An intertheory infrastructure should guarantee
that, for each theory object $\mathbf{T}$ and each preselected extension $\mathbf{T}'$ of $\mathbf{T}$, every
theorem, definition, and profile installed in $\mathbf{T}$ is also installed in $\mathbf{T}'$.



D. Theory Copies. The naive infrastructure does not allow the infrastructure 
state to contain two theory objects storing the same theory. As a consequence, it 
is not possible to add a copy of a theory object to the infrastructure state. We will 
see in section 5 that creating copies of a theory object is a useful modularization 
technique. 

* Profiles are called constant specifications in [13] and constraints in [15].

3.3 Requirements

Our analysis of the naive intertheory infrastructure suggests that the intertheory 
infrastructure should satisfy the following requirements: 

R1 The infrastructure enables theories and interpretations to be stored.

R2 Known theorems of a theory can be stored with the theory.

R3 Definitions can be made in a theory by extending the theory in place.

R4 Profiles can be made in a theory by extending the theory in place. 

R5 Theorems, definitions, and profiles installed in a theory are automatically 
installed in certain preselected extensions of the theory. 

R6 An interpretation of $T_1$ in $T_2$ can be extended in place to an interpretation
of $T_1'$ in $T_2'$ if $T_i$ is extended to $T_i'$ by definitions or profiles for $i = 1, 2$.

R7 A copy of a stored theory can be created and then developed independently 
from the original theory. 

The naive infrastructure satisfies only requirements R1 and R2. The IMPS
intertheory infrastructure satisfies all of the requirements except R4 and R7. 



4 The Intertheory Infrastructure 

This section presents an intertheory infrastructure that satisfies all seven require- 
ments in section 3.3. It is the same as the naive infrastructure except that the 
infrastructure objects and operations are different. That is, the infrastructure 
state is a set of infrastructure objects, is initially the empty set, and is changed 
by the application of infrastructure operations which add new objects to the 
state or modify objects already in the state. As in the naive infrastructure, let
S denote the infrastructure state.



4.1 Objects 

There are five kinds of infrastructure objects. The first four are defined simulta- 
neously as follows: 

1. A theory object is a tuple $T = (n, L_0, \Gamma_0, L, \Gamma, \Delta, \sigma, N)$ where:

(a) $n$ is a string called the name of T. It is denoted by [T].

(b) $L_0$ and $L$ are languages such that $L_0 \subseteq L$. $L_0$ and $L$ are called the base
language and the current language of T, respectively.

(c) $\Gamma_0 \subseteq S_{L_0}$ and $\Gamma \subseteq S_L$ with $\Gamma_0 \subseteq \Gamma$. The members of $\Gamma_0$ and $\Gamma$ are
called the base axioms and the current axioms of T, respectively.

(d) $\Gamma \subseteq \Delta \subseteq \{A_* \in S_L : \Gamma \models A_*\}$. The members of $\Delta$ are called the known
theorems of T, and $\Delta$ is denoted by thms(T).

(e) $\sigma$ is a finite sequence of theorem, definition, and profile objects called
the event history of T.

(f) $N$ is a set of names of theory objects called the principal subtheories of
T. For each $[T'] \in N$ with $T' = (n', L_0', \Gamma_0', L', \Gamma', \Delta', \sigma', N')$,
$L_0' \subseteq L_0$, $\Gamma_0' \subseteq \Gamma_0$, $L' \subseteq L$, $\Gamma' \subseteq \Gamma$, $\Delta' \subseteq \Delta$, and $\sigma'$ is a subsequence of $\sigma$.

The base theory of T is the theory $(L_0, \Gamma_0)$ and the current theory of T,
written thy(T), is the theory $(L, \Gamma)$.

2. A theorem object is a tuple $A = ([T], A_*, J)$ where:

(a) T is a theory object with $\mathrm{thy}(T) = (L, \Gamma)$.

(b) $A_* \in S_L$. $A_*$ is called the theorem of A.

(c) $J$ is a justification that $\Gamma \models A_*$.

3. A definition object is a tuple $D = ([T], c_\alpha, E_\alpha, J)$ where:

(a) T is a theory object with $\mathrm{thy}(T) = (L, \Gamma)$.

(b) $c_\alpha$ is a constant not in $L$.

(c) $E_\alpha \in \mathcal{E}_L$. $c_\alpha = E_\alpha$ is called the defining axiom of D.

(d) $J$ is a justification that $\Gamma \models O_*$, where $O_*$ is $(\exists x_\alpha \,.\, x_\alpha = E_\alpha)$ and $x_\alpha$
does not occur in $E_\alpha$.† $O_*$ is called the obligation of D.

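4. A profile object is a tuple $P = ([T], C, E_\beta, J)$ where:

(a) T is a theory object with $\mathrm{thy}(T) = (L, \Gamma)$.

(b) $C = \{c^1_{\alpha_1}, \ldots, c^m_{\alpha_m}\}$ is a set of constants not in $L$.

(c) $E_\beta = (\lambda x^1_{\alpha_1} \cdots \lambda x^m_{\alpha_m} \,.\, B_*)$ where $x^1_{\alpha_1}, \ldots, x^m_{\alpha_m}$ are distinct variables.
$E_\beta(c^1_{\alpha_1}, \ldots, c^m_{\alpha_m})$ is called the profiling axiom of P.

(d) $J$ is a justification that $\Gamma \models O_*$, where $O_*$ is $(\exists x^1_{\alpha_1} \cdots \exists x^m_{\alpha_m} \,.\, B_*)$. $O_*$ is
called the obligation of P.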

An event object is a theorem, definition, or profile object. 

Let $T \le T'$ mean $\mathrm{thy}(T) \le \mathrm{thy}(T')$ and $T \trianglelefteq T'$ mean $\mathrm{thy}(T) \trianglelefteq \mathrm{thy}(T')$. T is
a structural subtheory of T' if one of the following is true:

1. T = T'.

2. T is a structural subtheory of a principal subtheory of T'.

T is a structural supertheory of T' if T' is a structural subtheory of T.

For a theory object $T = (n, L_0, \Gamma_0, L, \Gamma, \Delta, \sigma, N)$ and an event object e whose
justification is correct, T[e] is the theory object defined as follows:

1. Let e be a theorem object $([T'], A_*, J)$. If $T' \le T$, then

$T[e] = (n, L_0, \Gamma_0, L, \Gamma, \Delta \cup \{A_*\}, \sigma {}^\frown \langle e \rangle, N)$;
otherwise, T[e] is undefined.

2. Let e be a definition object $([T'], c_\alpha, E_\alpha, J)$. If $T' \le T$ and $c_\alpha \notin L$, then

$T[e] = (n, L_0, \Gamma_0, L \cup \{c_\alpha\}, \Gamma \cup \{A_*\}, \Delta \cup \{A_*\}, \sigma {}^\frown \langle e \rangle, N)$

where $A_*$ is the defining axiom of e; otherwise, T[e] is undefined.

† In C, $\Gamma \models (\exists x_\alpha \,.\, x_\alpha = E_\alpha)$ always holds and so no justification is needed, but in
other logics such as LUTINS a justification is needed since $\Gamma \models (\exists x_\alpha \,.\, x_\alpha = E_\alpha)$ will
not hold if $E_\alpha$ is undefined.

3. Let e be a profile object $([T'], C, E_\beta, J)$. If $T' \le T$ and $C \cap L = \emptyset$, then

$T[e] = (n, L_0, \Gamma_0, L \cup C, \Gamma \cup \{A_*\}, \Delta \cup \{A_*\}, \sigma {}^\frown \langle e \rangle, N)$

where $A_*$ is the profiling axiom of e; otherwise, T[e] is undefined.

An event history $\sigma$ is correct if the justification in each member of $\sigma$ is correct.
For a correct event history $\sigma$, $T[\sigma]$ is defined by:

1. Let $\sigma = \langle\,\rangle$. Then $T[\sigma] = T$.

2. Let $\sigma = \sigma' {}^\frown \langle e \rangle$. If $(T[\sigma'])[e]$ is defined, then $T[\sigma] = (T[\sigma'])[e]$; otherwise,
$T[\sigma]$ is undefined.

Let the base of T, written base(T), be the theory object
$(n\text{\_base}, L_0, \Gamma_0, L_0, \Gamma_0, \Gamma_0, \langle\,\rangle, \emptyset)$.

T is proper if the following conditions are satisfied:

1. Its event history $\sigma$ is correct.

2. $\mathrm{thy}(T) = \mathrm{thy}(\mathrm{base}(T)[\sigma])$.

3. $\mathrm{thms}(T) = \mathrm{thms}(\mathrm{base}(T)[\sigma])$.

Lemma 2. If T is a proper theory object, then A* is a known theorem of T iff 
A* is a base axiom of T or a theorem, defining axiom, or profiling axiom of an 
event object in the event history of T. 

Proof. Follows immediately from the definitions above. 
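
As an informal illustration, the definitions of T[e], T[σ], base(T), and properness can be mirrored in a few lines of Python. This is only a sketch under simplifying assumptions: languages are modelled as sets of constant names, sentences as opaque strings, every justification is taken to be correct, and the side condition that the event's theory is a subtheory of T is not checked. The names `Event`, `TheoryObject`, `apply_event`, `apply_history`, `base`, and `is_proper` are hypothetical and are not taken from the paper or from any existing system.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """A theorem, definition, or profile object (justification omitted)."""
    kind: str                       # "theorem", "definition", or "profile"
    new_constants: FrozenSet[str]   # empty for a theorem object
    sentence: str                   # theorem, defining axiom, or profiling axiom


@dataclass(frozen=True)
class TheoryObject:
    name: str
    base_language: FrozenSet[str]
    base_axioms: FrozenSet[str]
    language: FrozenSet[str]
    axioms: FrozenSet[str]
    theorems: FrozenSet[str]
    history: Tuple[Event, ...]
    principal: FrozenSet[str]


def apply_event(t: TheoryObject, e: Event) -> Optional[TheoryObject]:
    """Return T[e], or None where T[e] is undefined."""
    history = t.history + (e,)
    if e.kind == "theorem":
        # A theorem only enlarges the known theorems and the history.
        return replace(t, theorems=t.theorems | {e.sentence}, history=history)
    if e.new_constants & t.language:
        return None  # a "new" constant already occurs in the current language
    return replace(
        t,
        language=t.language | e.new_constants,
        axioms=t.axioms | {e.sentence},
        theorems=t.theorems | {e.sentence},
        history=history,
    )


def apply_history(t: TheoryObject, events) -> Optional[TheoryObject]:
    """Return T[sigma] by applying the events one at a time."""
    for e in events:
        t = apply_event(t, e)
        if t is None:
            return None
    return t


def base(t: TheoryObject) -> TheoryObject:
    """The base of T: base language and axioms, empty history."""
    return TheoryObject(t.name + "_base", t.base_language, t.base_axioms,
                        t.base_language, t.base_axioms, t.base_axioms,
                        (), frozenset())


def is_proper(t: TheoryObject) -> bool:
    """Conditions 2 and 3 of properness; condition 1 is assumed."""
    replayed = apply_history(base(t), t.history)
    return (replayed is not None
            and (replayed.language, replayed.axioms) == (t.language, t.axioms)
            and replayed.theorems == t.theorems)
```

Because the objects are immutable, each event yields a new value, which keeps the replay in `is_proper` a direct transcription of base(T)[σ].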

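Theorem 3. If T is a proper theory object, then $\mathrm{base}(T) \trianglelefteq T$.

Proof. Since T is proper, the event history $\sigma$ of T is correct and $\mathrm{thy}(T) =
\mathrm{thy}(\mathrm{base}(T)[\sigma])$. We will show $\mathrm{base}(T) \trianglelefteq T$ by induction on $|\sigma|$, the length of $\sigma$.

Basis. Assume $|\sigma| = 0$. Then $\mathrm{thy}(T) = \mathrm{thy}(\mathrm{base}(T))$ and so $\mathrm{base}(T) \trianglelefteq T$ is
obviously true.

Induction step. Assume $|\sigma| > 0$. Suppose $\sigma = \sigma' {}^\frown \langle e \rangle$. By the induction
hypothesis, $\mathrm{base}(T) \trianglelefteq \mathrm{base}(T)[\sigma']$. We claim $\mathrm{base}(T)[\sigma'] \trianglelefteq (\mathrm{base}(T)[\sigma'])[e]$. If e
is a theorem object, then clearly $\mathrm{thy}(\mathrm{base}(T)[\sigma']) = \mathrm{thy}((\mathrm{base}(T)[\sigma'])[e])$ and
so $\mathrm{base}(T)[\sigma'] \trianglelefteq (\mathrm{base}(T)[\sigma'])[e]$. If e is a definition or profile object, then
$\mathrm{base}(T)[\sigma'] \trianglelefteq (\mathrm{base}(T)[\sigma'])[e]$ by the justification of e. Therefore, $\mathrm{base}(T) \trianglelefteq T$
follows by part (2) of Lemma 1. □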

We will now define the fifth and last infrastructure object: An interpretation
object is a tuple $I = ([T], [T'], \Phi, J)$ where:

1. T is a theory object called the source theory of I.

2. T' is a theory object called the target theory of I.

3. $\Phi$ is a translation.

4. $J$ is a justification that $\Phi$ is an interpretation of $\mathrm{thy}(\mathrm{base}(T)[\sigma])$ in
$\mathrm{thy}(\mathrm{base}(T')[\sigma'])$ where $\sigma$ and $\sigma'$ are initial segments of the event histories
of T and T', respectively.

4.2 Operations

The infrastructure design includes ten operations. 

There are operations for creating the infrastructure objects: 

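1. Given a string $n$, a language $L$, a set $\Gamma$ of sentences, and theory objects
$T_i = (n^i, L^i_0, \Gamma^i_0, L^i, \Gamma^i, \Delta^i, \sigma^i, N^i) \in S$ for $i = 1, \ldots, m$ as input, let

(a) $L'_0 = L^1_0 \cup \cdots \cup L^m_0$.

(b) $\Gamma'_0 = \Gamma^1_0 \cup \cdots \cup \Gamma^m_0$.

(c) $L' = L^1 \cup \cdots \cup L^m$.

(d) $\Gamma' = \Gamma^1 \cup \cdots \cup \Gamma^m$.

(e) $\Delta' = \Delta^1 \cup \cdots \cup \Delta^m$.

(f) $\sigma' = \sigma^1 {}^\frown \cdots {}^\frown \sigma^m$.

If

$T = (n, L \cup L'_0, \Gamma \cup \Gamma'_0, L \cup L', \Gamma \cup \Gamma', \Gamma \cup \Delta', \sigma', \{[T_1], \ldots, [T_m]\})$

is a theory object and $n \ne [T']$ for any theory object $T' \in S$, then
create-thy-obj adds T to S; otherwise, the operation fails.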

2. Given a theory object $T \in S$, a sentence $A_*$, and a justification $J$ as input,
if $A = ([T], A_*, J)$ is a theorem object, then create-thm-obj adds A to S;
otherwise, the operation fails.

3. Given a theory object $T \in S$, a constant $c_\alpha$, an expression $E_\alpha$, and a
justification $J$ as input, if $D = ([T], c_\alpha, E_\alpha, J)$ is a definition object, then
create-def-obj adds D to S; otherwise, the operation fails.

4. Given a theory object $T \in S$, a set $C$ of constants, an expression $E_\beta$, and
a justification $J$ as input, if $P = ([T], C, E_\beta, J)$ is a profile object, then
create-pro-obj adds P to S; otherwise, the operation fails.

5. Given two theory objects $T, T' \in S$, a translation $\Phi$, and a justification $J$ as
input, if $I = ([T], [T'], \Phi, J)$ is an interpretation object, then create-int-obj
adds I to S; otherwise, the operation fails.

There are operations for installing theorem, definition, and profile objects in 
theory objects: 

1. Given a theorem object $A = ([T_0], A_*, J) \in S$ and a theory object $T_1 \in S$,
if $T_0 \le T_1$, then install-thm-obj replaces every structural supertheory T of
$T_1$ in S with T[A]; otherwise, the operation fails.

2. Given a definition object $D = ([T_0], c_\alpha, E_\alpha, J) \in S$ and a theory object
$T_1 \in S$, if $T_0 \le T_1$ and T[D] is defined for every structural supertheory
T of $T_1$ in S, then install-def-obj replaces every structural supertheory T of
$T_1$ in S with T[D]; otherwise, the operation fails.

3. Given a profile object $P = ([T_0], C, E_\beta, J) \in S$ and a theory object $T_1 \in S$,
if $T_0 \le T_1$ and T[P] is defined for every structural supertheory T of $T_1$
in S, then install-pro-obj replaces every structural supertheory T of $T_1$ in S
with T[P]; otherwise, the operation fails.
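
The all-or-nothing propagation performed by the install operations can be sketched in Python as follows. This is a toy model and not the paper's implementation: theory objects are plain dictionaries keyed by name, the principal-subtheory graph is assumed to be acyclic, the condition that the definition's own theory is a subtheory of the target is not checked, and the function names and the example theories (`monoid`, `group`, `ring`, `order`) are hypothetical.

```python
def new_theory(language, principal=()):
    """Build a toy theory object with an empty event history."""
    return {"language": set(language), "axioms": set(), "theorems": set(),
            "history": [], "principal": set(principal)}


def is_structural_subtheory(sub, sup, state):
    """True if `sub` is `sup` or a structural subtheory of one of the
    principal subtheories of `sup`."""
    if sub == sup:
        return True
    return any(is_structural_subtheory(sub, p, state)
               for p in state[sup]["principal"])


def structural_supertheories(name, state):
    """Names of all theory objects of which `name` is a structural subtheory."""
    return {t for t in state if is_structural_subtheory(name, t, state)}


def install_definition(state, target, constant, axiom):
    """Toy analogue of install-def-obj; returns False if the operation fails."""
    supers = structural_supertheories(target, state)
    # T[D] must be defined for every structural supertheory, otherwise
    # nothing in the state is changed.
    if any(constant in state[t]["language"] for t in supers):
        return False
    for t in supers:
        state[t]["language"].add(constant)
        state[t]["axioms"].add(axiom)
        state[t]["theorems"].add(axiom)
        state[t]["history"].append(("definition", constant, axiom))
    return True


state = {
    "monoid": new_theory({"e", "*"}),
    "group": new_theory({"e", "*", "inv"}, principal={"monoid"}),
    "ring": new_theory({"e", "*", "inv", "1", "."}, principal={"group"}),
    "order": new_theory({"<"}),
}
ok = install_definition(state, "monoid", "sq", "sq = (lambda x . x * x)")
```

In this example the definition of `sq` reaches `group` and `ring` but not `order`, while an attempt to define `inv` in `monoid` would fail because `inv` already occurs in the language of `group`.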
There are operations to extend an interpretation object and to copy a theory
object:

1. Given an interpretation object $I = ([T], [T'], \Phi, J) \in S$, a translation $\Phi'$, and
a justification $J'$ as input, if $\Phi'$ extends $\Phi$ and $I' = ([T], [T'], \Phi', J')$ is an
interpretation object, then extend-int replaces I in S with I'; otherwise, the
operation fails.

2. Given a string $n$ and a theory object

$T = (n', L_0, \Gamma_0, L, \Gamma, \Delta, \sigma, N) \in S$

as input, if $n \ne [T']$ for any theory object $T' \in S$, then create-thy-copy adds
the theory object

$T' = (n, L_0, \Gamma_0, L, \Gamma, \Delta, \sigma, N)$

to S; otherwise, the operation fails.
The infrastructure operations guarantee that the following theorem holds: 
Theorem 4. If the justification of every event object in S is correct, then: 

1. Every object in S is a well-defined theory, theorem, definition, profile, or 
interpretation object. 

2. Every theory object in S is proper. 

3. Distinct theory objects in S have distinct names. 



Some Remarks about the Intertheory Infrastructure: 

1. Theory and interpretation objects are modifiable, but event objects are not. 

2. The event history of a theory object records how the theory object is con- 
structed from its base theory. 

3. The theory stored in a theory object T extends all the theories stored in the 
principal subtheories of T. 

4. Theorem, definition, and profile objects installed in a theory T in S are 
automatically installed in every structural supertheory of T in S. 

5. The infrastructure allows definitions and profiles to be made in a theory 
object T both by modifying T using install-def-obj and install-pro-obj and
by creating an extension of T using create-thy-obj. 

6. By Theorem 2, if $\Phi$ is a translation from thy(T) to thy(T') which maps the
base axioms of T to known theorems of T', then $\Phi$ is an interpretation of
thy(T) in thy(T').

7. The interpretation stored in an interpretation object is allowed to be
incomplete. It can be extended as needed using extend-int.

5 Some Applications

5.1 Theory Development System 

The intertheory infrastructure provides a strong foundation on which to build a 
system for developing axiomatic theories. The infrastructure operations enable 
theories and interpretations to be created and extended. Many additional opera- 
tions can be built on top of the ten infrastructure operations. Examples include 
operations for transporting theorems, definitions, and profiles from one theory 
to another and for instantiating theories. 

Given a theorem object $A = ([T_0], A_*, J_0)$ installed in $T \in S$ and an
interpretation object $I = ([T], [T'], \Phi, J) \in S$ as input, the operation transport-thm-obj
would invoke create-thm-obj and install-thm-obj to create a new theorem object
$([T'], \Phi(A_*), J')$ and install it in T'. The justification $J'$ would be formed from
$J_0$ and $J$.

Given a constant $d_\beta$, a definition object $D = ([T_0], c_\alpha, E_\alpha, J_0)$ installed in
$T \in S$, and an interpretation object $I = ([T], [T'], \Phi, J) \in S$ as input, if $\Phi(\alpha) =
\beta$ and $d_\beta$ is not in the current language of T', the operation transport-def-obj
would invoke create-def-obj and install-def-obj to create a new definition object
$([T'], d_\beta, \Phi(E_\alpha), J')$ and install it in T'; otherwise, the operation fails. The
justification $J'$ would be formed from $J_0$ and $J$. An operation transport-pro-obj could
be defined similarly.

Given theory objects $T, T' \in S$ and an interpretation object $I =
([T_0], [T_0'], \Phi, J) \in S$ as input, if $T_0 \le T$ and $T_0' \le T'$, the operation
instantiate-thy would invoke create-thy-obj to create a new theory object T'' and
create-int-obj to create a new interpretation object $I' = ([T], [T''], \Phi', J)$ such
that:

- T'' is an extension of T' obtained by “instantiating” $T_0$ in T with T'. How
T' is cemented to the part of T outside of $T_0$ is determined by $\Phi$. The
constants of T which are not in $T_0$ may need to be renamed and retagged
to avoid conflicts with the constants in T'.

- $\Phi'$ is an interpretation of thy(T) in thy(T'') which extends $\Phi$.

For further details, see [7] . 

This notion of theory instantiation is closely related to the notion of theory 
instantiation proposed by Burstall and Goguen [2] ; in both approaches a theory 
is instantiated via an interpretation. However, in our approach, any theory can 
be instantiated with respect to any of its subtheories. In the Burstall-Goguen ap- 
proach, only “parameterized theories” can be instantiated and only with respect 
to the explicit parameter of the parameterized theory.

5.2 Foundational Theory Development System

A theory development system is foundational if every theory developed in the 
system is consistent relative to one or more “foundational” theories which are 
known or regarded to be consistent. Since the operations for installing theorems, 
definitions, and profiles in a theory always produce conservative extensions of the 
original theory by Theorem 3, these operations preserve consistency. Therefore, 
a foundational theory development system can be implemented on top of the 
infrastructure design by simply using a new operation for creating theory objects 
that is successful only when the theory stored in the object is consistent relative 
to one of the foundational theories. 

The new operation can be defined as follows. Suppose T* is a foundational
theory. Given a string $n$, a language $L$, a set $\Gamma$ of sentences, theory objects
$T_1, \ldots, T_m \in S$, a translation $\Phi$, and a justification $J$ as input, if $J$ is a
justification that $\Phi$ is an interpretation of $T = (L, \Gamma)$ in thy(T*), the new operation
would invoke create-thy-obj on $(n, L, \Gamma, \{[T_1], \ldots, [T_m]\})$ to create a theory
object T and then invoke create-int-obj on $([T], [T^*], \Phi, J)$ to create an
interpretation object I; otherwise the operation fails. If the operation is successful and $J$
is correct, then thy(T) would be consistent relative to thy(T*) by Theorem 1.

5.3 Encapsulated Theory Development 

Proving a theorem in a theory may require introducing several definitions and 
proving several lemmas in the theory that would not be useful after the theorem 
is proved. Such “local” definitions and lemmas would become logical clutter in 
the theory. One strategy for handling this kind of clutter is to encapsulate local 
development in an auxiliary theory so that it can be separated from the
development of the main theory. The infrastructure design makes this encapsulation
possible. 

Suppose that one would like to prove a theorem in a theory stored in theory 
object T using some local definitions and lemmas. One could use create-thy-copy 
to create a copy T' of T and create-int-obj to create an interpretation object I
storing the identity interpretation of thy(T') in thy(T). Next the needed local
definitions and lemmas could be installed as definition and theorem objects in
T'. Then the theorem could be proved and installed as a theorem object in T'.
Finally, the theorem could be transported back to T using the interpretation 
stored in I. The whole local development needed to prove the theorem would 
reside in T' completely outside of the development of T. 

A different way to encapsulate local theory development is used in the ACL2 
theorem prover [15]. 

5.4 Sequent-Style Proof System 

A goal-oriented sequent-style proof system can be built on top of the intertheory 
infrastructure. A sequent would have the form $T \to A_*$ where T is a theory
object called the context and $A_*$ is a sentence in the current language of T
called the assertion. The system would include the usual inference rules of a
sequent-style proof system plus rules to: 

— Install a theorem, definition, or profile into the context of a sequent. 

— Transport a theorem, definition, or profile from a theory object to the context 
of a sequent. 

Some of the proof rules, such as the deduction rule, would add or remove axioms 
from the context of a sequent, thereby defining new theory objects. The proof 
rules for the rules of universal generalization and existential generalization would 
be implemented by installing a profile in the context of a sequent. 

A sentence $A_*$ in the current language of a theory object T would be proved as
follows. First, create-thy-copy would be used to create a copy T' of T and
create-int-obj would be used to create an interpretation object I storing the identity
interpretation of thy(T') in thy(T). Then the sequent $T' \to A_*$ would be proved, possibly
with the help of local or imported definitions and lemmas. The contexts created
in the course of the proof would be distinct supertheories of T'. A theorem or
definition installed in a context appearing in some part of the proof would be 
available wherever else the context appeared in the proof. 

When the proof is finished, $A_*$ would be installed as a theorem object in T'.
The theorem could then be transported back to T using the interpretation stored
in I. The theory objects needed for the proof (T' and its supertheories) would
be separated from T and the other theory objects in S.

Abstract. A computer implementation of Gödel’s algorithm for class
formation in Mathematica™ is useful for automated reasoning in set
theory. The original intent was to forge a convenient preprocessing tool
to help prepare input files for McCune’s automated reasoning program
Otter. The program is also valuable for discovering new theorems. Some
applications are described, especially to the definition of functions. A
brief extract from the program is included in an appendix.



1 Introduction 

Robert Boyer et al. (1986) proposed clauses capturing the essence of Gödel’s
finite axiomatization of the von Neumann-Bernays theory of sets and classes.
Their work was simplified significantly by Art Quaife (1992a and 1992b). About
four hundred theorems of elementary set theory were proved using McCune’s
automated reasoning program Otter. A certain degree of success has been achieved
recently (1999a and 1999b) in extending Quaife’s work. Some elementary
theorems of ordinal number theory were proved, based on Isbell’s definition (1960) of
ordinal number, which does not require the axiom of regularity to be assumed.

An admitted disadvantage of Gödel’s formalism is the absence of the usual
class formation {x | p(x)} notation. Replacing the axiom schema for class
formation are a small number of axioms for certain basic class constructions.
Definitions of classes must be expressed in terms of two basic classes, the universal
class V and the membership relation E, and seven other basic class constructors:
the unary constructors complement, domain, flip and rotate, and the binary
constructors pairset, cart, and intersection. Gödel also included an axiom for
inverse, but it can be deduced from the others.



2 A Brief Description of the GOEDEL Program 

As a replacement for the axiom schema for class formation, Kurt Gödel (1940)
proved a fundamental Class Existence Metatheorem Schema for class formation.
His proof of this metatheorem is constructive; a recursive algorithm for
converting customary definitions of classes using class formation to expressions built out
of the primitive constructors is presented, together with a proof of termination.
An implementation of Gödel’s algorithm in Mathematica™ was created (1996)
to help prepare input files for proofs in set theory using McCune’s automated
reasoning program Otter.

The likelihood of success in proving theorems using programs like Otter 
depends critically on the simplicity of the definitions used and the brevity of the 
statements of the theorems to be proved. To mitigate the effects of combinatorial 
explosion, one typically sets a weight limit to exclude complicated expressions 
from being considered. Although combinatorial explosion can not be prevented, 
the idea is to snatch a proof quickly before the explosion gets well under way. 

Because one needs compact definitions for practical applications, and because
the output of Gödel’s original algorithm is typically extremely complicated, a
large number of simplification rules were added to the Mathematica
implementation of Gödel’s algorithm. With the addition of simplification rules, Gödel’s
proof of termination no longer applies. No assurance can be given that the added
simplification rules will not cause looping to occur, but we have tested the
program on a suite of several thousand examples, and it appears that it can be used
as a practical tool to help formulate definitions and to simplify the statements of
theorems. The GOEDEL program contains no mechanism for carrying out
deductions, but it does sometimes manage to prove statements by simplifying them to
True.

Much of the complexity of Gödel’s original algorithm stems from his use
of Kuratowski’s definition for ordered pairs. The Mathematica implementation
does not assume any particular construction of ordered pairs, but instead includes
additional rules to deal with ordered pairs. The self-membership rule in the
original algorithm was modified because in our work on ordinal numbers the
axiom of regularity is not assumed.

The stripped down version of the GOEDEL program presented in the Appendix
omits many membership rules for defined constructors as well as most of the
simplification rules. The modified Gödel’s algorithm is presented as a series of
definitions for a Mathematica function class[x, p]. The first argument x,
assumed to be the name of a set, must be either an atomic symbol, or an expression
of the form pair[u, v] where u and v in turn are either atomic symbols or pairs,
and so on. It should be noted that Gödel did not allow both u and v to be pairs,
but this unnecessary limitation has been removed to make the formalism more
flexible. The second argument p is some statement which can involve the
variables that appear in x, as well as other variables that may represent arbitrary
classes (not just sets). The statement can contain quantifiers, but the quantified
variables must be sets. The Gödel algorithm does not apply to statements
containing quantifiers over proper classes. The quantifiers forall and exists used
in the GOEDEL program are explicitly restricted to set variables.

A few simple examples will be presented to illustrate how the GOEDEL program
is used. For convenience, Mathematica style notation will be employed,
which does not quite conform to the notational requirements of Otter. For
example, Mathematica permits one to define intersection to be an associative
and commutative function of any number of variables. For brevity we write
a → b to mean that Mathematica input a produces Mathematica output b for
some version of the GOEDEL program.

The functions FIRST and SECOND which project out the first and second 
components of an ordered pair, respectively, can be specified as the classes 

class[pair[pair[x, y], z], equal[z, x]] → FIRST,

class[pair[pair[x, y], z], equal[z, y]] → SECOND.

Examples which involve quantifiers include the domain and range of a relation: 

class[x, exists[y, member[pair[x, y], z]]] → domain[z],
class[y, exists[x, member[pair[x, y], z]]] → range[z].

It is implicitly assumed that all quantified variables refer to sets, but the free 
variable z here can stand for any class. 
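
To make the two quantified examples concrete, the following Python sketch evaluates the class expressions by brute force over a small finite universe and compares the results with the usual domain and range of a relation. It is a toy model and not part of the GOEDEL program: ordered pairs are modelled as Python tuples, the universe is a handful of integers, and `UNIVERSE`, `class_of`, `domain`, and `range_of` are hypothetical names.

```python
UNIVERSE = range(4)  # toy universe standing in for the class of all sets


def class_of(predicate):
    """Finite stand-in for class[x, p]: all x in the universe satisfying p."""
    return {x for x in UNIVERSE if predicate(x)}


def domain(z):
    """First components of the pairs in z."""
    return {pair[0] for pair in z}


def range_of(z):
    """Second components of the pairs in z."""
    return {pair[1] for pair in z}


# An arbitrary relation z, given as a set of ordered pairs.
z = {(0, 1), (0, 2), (3, 1)}

# class[x, exists[y, member[pair[x, y], z]]]
dom = class_of(lambda x: any((x, y) in z for y in UNIVERSE))
# class[y, exists[x, member[pair[x, y], z]]]
rng = class_of(lambda y: any((x, y) in z for x in UNIVERSE))
```

The existential quantifier becomes `any` over the universe, which is exactly the part of the definition that the constructors domain and range absorb.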

3 Eliminating Flip and Rotate 

Gödel’s algorithm uses two special constructors flip[x] and rotate[x] which
produce ternary relations. The ternary relation flip[x] is

class[pair[pair[u, v], w], member[pair[pair[v, u], w], x]]

while rotate[x] is

class[pair[pair[u, v], w], member[pair[pair[v, w], u], x]].

Because these functors are not widely used in mathematics, it may be of
interest to note that they could be eliminated in favor of more familiar ones.
One can rewrite flip[x] as composite[x, SWAP], where SWAP = flip[Id] is the
relation

class[pair[pair[u, v], pair[x, y]], and[equal[u, y], equal[v, x]]] → SWAP.

Note that the functions that project out the first and second members of an
ordered pair are related by SECOND = flip[FIRST] and FIRST = flip[SECOND].

The general formula for rotate[x] is more complicated, but Gödel’s algorithm
actually only involves the special case where x is a Cartesian product. In this
special case one has the simple formula

rotate[cart[x_, y_]] := composite[x, SECOND, id[cart[y, V]]].

Using these formulas, the constructors flip and rotate could be completely
eliminated from Gödel’s algorithm, as well as from Gödel’s axioms for class
theory, if one instead takes as primitives the constructors composite, inverse,
FIRST and SECOND. We have done so in the abbreviated version of the GOEDEL
program listed in the Appendix. The function SWAP mentioned above, for
example, could be defined in terms of these new primitives as

intersection[composite[inverse[FIRST], SECOND],
composite[inverse[SECOND], FIRST]] := SWAP.
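
The identities of this section can be checked on a small finite model. The sketch below is an illustration in Python rather than an excerpt from the GOEDEL program. It assumes that a relation is a set of (input, output) tuples mirroring pair[input, output], and that composite[r, s] applies s first and then r; the second assumption is the convention under which the SWAP formula comes out right in this model. All names are hypothetical.

```python
from itertools import product

ATOMS = (0, 1, 2)
PAIRS = list(product(ATOMS, repeat=2))

# Relations are sets of (input, output) tuples.
FIRST = {((u, v), u) for u, v in PAIRS}
SECOND = {((u, v), v) for u, v in PAIRS}
SWAP = {((u, v), (v, u)) for u, v in PAIRS}
ID_ON_PAIRS = {(p, p) for p in PAIRS}


def inverse(r):
    """Reverse every (input, output) tuple of r."""
    return {(b, a) for a, b in r}


def composite(r, s):
    """Relational composition, assumed convention: apply s first, then r."""
    return {(a, c) for a, b in s for b2, c in r if b == b2}


def flip(r):
    """((u, v), w) is in flip(r) exactly when ((v, u), w) is in r."""
    return {((v, u), w) for (u, v), w in r}


# SWAP rebuilt from the primitives composite, inverse, FIRST, and SECOND.
swap_from_primitives = (composite(inverse(FIRST), SECOND)
                        & composite(inverse(SECOND), FIRST))

# An arbitrary ternary relation for testing flip[x] = composite[x, SWAP].
x = {((0, 1), 2), ((2, 2), 0)}
```

With these definitions one can compare `swap_from_primitives` with `SWAP`, `flip(FIRST)` with `SECOND`, and `composite(x, SWAP)` with `flip(x)`.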
4 Equational Set Theory without Variables

The simplification rules in the GOEDEL program can be used not only to simplify
descriptions of classes, but also to simplify statements. Given
any statement p, one can form the class class[w, p] where w is any variable that
does not occur in the statement p. This class is the universal class V if p is true,
and is the empty class when p is false. One can form a new statement equivalent
to the original one by the definition

assert[p_] := Module[{w = Unique[]}, equal[V, class[w, p]]]

The occurrence of class causes Gödel’s algorithm to be invoked, the meaning
of the statement p to be interpreted, and the simplification rules in the GOEDEL
program to be applied. While there can be no assurance the transformed
statement will actually be simpler than the statement one started with, in practice
it often is. For instance, the input

assert[equal[composite[cross[x, x], DUP], composite[DUP, x]]]

produces the statement FUNCTION[composite[Id, x]] as output. To improve the
readability of the output, in the current version of the GOEDEL program, rules
have been added which may convert the equations obtained with assert back
to simpler non-equational statements.

Since some theorem provers are limited to equational statements, it is of 
interest to reformulate set theory in equational terms. Alfred Tarski and Steven 
Givant (1987) have shown that all statements of set theory can be reformulated 
as equations without variables, somewhat reminiscent of combinatory logic. But 
whereas combinatory logic uses function-like objects as primitives, their calculus 
is based on the theory of relations. It has recently been proposed by Omodeo 
and Formisano (1998) that this formalism be used to recast set theory in a form 
accessible to purely equational automated reasoning programs. It is interesting 
to note that the assert mechanism in the GOEDEL program achieves the same
objective. Any statement is converted by assert into an equation of the form 
equal[V, x]. If one prefers, one may also write this equation in the equivalent 
form equal[0, complement[x]]. 

Another consequence of the assert process is that one can always convert 
negative statements into positive ones. For example, the negative statement 
not[equal[0, x]] is converted by assert into the equivalent positive statement 
equal[V, image[V, x]]. Thus it appears that at least in set theory it does not
make too much sense to make a big distinction between positive and nega- 
tive literals, because one can always convert the one into the other. Also, one 
can always convert a clause with several literals into a unit clause; the clause 
or[equal[0, x], equal[0, y]], for example, is equivalent to the unit clause 

equal[0, intersection[image[V, x], image[V, y]]]. 

The class image[V, x] which appears in these expressions is equal to the empty
set if x is empty, and is equal to the universal class V if x is not empty. This class
is quite useful for reformulating conditional statements as unconditional ones.
Many equations in set theory hold only for sets and not for proper classes. For 
example, the union of the singleton of a class x is x when x is a set, but is the 
empty set otherwise. This rule can be written as a single equation which applies 
to both cases as follows: 

U[singleton[x_]] := intersection[x, image[V, singleton[x]]] 

(This is in fact one of the thousands of rules in the GOEDEL program.) Although 
such unconditional statements initially appear to be more complex than the con- 
ditional statements that they replace, experience both with Otter and with the 
GOEDEL program indicates that the unconditional statements are in fact prefer- 
able. In Otter the unconditional rule can often be added to the demodulator 
list. In Mathematica, an unconditional simplification rule generally works faster 
than a conditional one. 
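
The role of image[V, x] in these reformulations can be checked exhaustively on a small finite model. The Python sketch below is an illustration only and not one of the rules of the GOEDEL program. It assumes a toy universe V consisting of the hereditarily finite sets of rank at most 2, treats every subcollection of V as a class, and calls a class a set exactly when it is a member of V. The function `image_of_V` encodes the behavior of image[V, x] stated in the text rather than deriving it, and all names are hypothetical.

```python
from itertools import combinations

EMPTY = frozenset()
# Toy universe of "sets": the hereditarily finite sets of rank at most 2.
V = frozenset({
    EMPTY,
    frozenset({EMPTY}),
    frozenset({frozenset({EMPTY})}),
    frozenset({EMPTY, frozenset({EMPTY})}),
})


def all_classes():
    """Every class of the toy model, i.e. every subcollection of V."""
    members = list(V)
    for size in range(len(members) + 1):
        for combo in combinations(members, size):
            yield frozenset(combo)


def is_set(x):
    return x in V


def singleton(x):
    """{x} when x is a set, the empty class when x is a proper class."""
    return frozenset({x}) if is_set(x) else EMPTY


def big_union(c):
    """U[c]: the union of the members of c."""
    return frozenset(e for member in c for e in member)


def image_of_V(x):
    """image[V, x] as described in the text: empty if x is empty, else V."""
    return V if x else EMPTY


CLASSES = list(all_classes())

# U[singleton[x]] = intersection[x, image[V, singleton[x]]]
union_rule_holds = all(
    big_union(singleton(x)) == x & image_of_V(singleton(x)) for x in CLASSES
)
# or[equal[0, x], equal[0, y]] versus the unit clause with intersection
clause_rule_holds = all(
    ((x == EMPTY) or (y == EMPTY))
    == (EMPTY == image_of_V(x) & image_of_V(y))
    for x in CLASSES for y in CLASSES
)
# not[equal[0, x]] versus equal[V, image[V, x]]
negation_rule_holds = all((x != EMPTY) == (V == image_of_V(x)) for x in CLASSES)
```

The model has 16 classes, of which 4 are sets, so both the set case and the proper-class case of the union rule are exercised.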

When assert is applied to a statement containing quantifiers, the statement 
is converted to a logically equivalent equation without quantifiers. All quantified 
variables are eliminated. What happens is that the quantifiers are neatly built 
into equivalent set-theoretic constructs like domain and composite. For example, 
the axiom of regularity is usually formulated using quantifiers as: 

implies[not[equal[x, 0]], exists[u, and[member[u, x], disjoint[u, x]]]]. 

When assert is applied, this statement is automatically converted into the 
equivalent quantifier-free statement 

or[equal[0, x], not[subclass[x, complement[P[complement[x]]]]]].

In this case the quantifier was hidden in the introduced power class functor. 
Replacing x by its complement, one obtains the following neat reformulation of 
the axiom of regularity: 

implies[subclass[P[x], x], equal[x, V]]. 

That is, the axiom of regularity says that the universal class is the only class 
which contains its own power class. When the axiom of regularity is not assumed, 
there may be other classes with this property. In particular, the Russell class 
RUSSELL = complement[fix[E]] has this property, a fact that is useful in the 
Otter proofs in ordinal number theory. 

This reformulation of the axiom of regularity has the advantage over the 
original that its clausification does not introduce new Skolem functions. 
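Both formulations of regularity can be compared on a tiny well-founded universe. The sketch below assumes a toy model in which V is the four hereditarily finite sets of rank below 3 and a class is any subset of V; the names powerset, P, regular_original and regular_reformulated are hypothetical helpers introduced for illustration.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of s, as a frozenset of frozensets."""
    items = list(s)
    return frozenset(frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))

EMPTY = frozenset()
V = powerset(powerset(powerset(EMPTY)))   # toy universe with four sets

def P(x):
    """Power class P[x]: the members of V that are subsets of x."""
    return frozenset(s for s in V if s <= x)

def regular_original(x):
    # implies[not[equal[x, 0]], exists[u, and[member[u, x], disjoint[u, x]]]]
    return x == EMPTY or any(not (u & x) for u in x)

def regular_reformulated(x):
    # implies[subclass[P[x], x], equal[x, V]]
    return (not P(x) <= x) or x == V

classes = powerset(V)                     # all 16 classes of the toy model
both_hold = all(regular_original(x) and regular_reformulated(x) for x in classes)
closed_classes = [x for x in classes if P(x) <= x]
print(both_hold, closed_classes == [V])
```

Because the toy universe is well-founded, the only class containing its own power class is V itself, in line with the reformulated axiom.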

5 Functions, Vertical Sections, and Cancellation Machines 

The process of eliminating variables and hiding quantifiers is facilitated by having 
available a supply of standard functions corresponding to the primitive construc- 
tors, as well as important derived constructors. For example, Quaife introduced 
the function SUCC corresponding to the constructor 

succ[x_] := union[x, singleton[x]] 

so that the statement that the set omega of natural numbers is closed under 
the successor operation could be written in the compact variable-free form as 
the condition subclass[image[SUCC, omega], omega]. This is just one of the many 
techniques that Quaife exploited to reduce the plethora of Skolem functions that 
had appeared in the earlier work of Robert Boyer et al. 

Replacing the function symbols of first order logic by bona fide set-theoretic 
functions not only helps to eliminate Skolem functions, but also improves the 
readability of the statements of theorems. A standard way to obtain definitions 
for most of these functions is in terms of a basic constructor VERTSECT, enabling 
one to introduce a lambda calculus for defining functions by specifying the re- 
sult obtained when they are applied to an input. The basic idea is not limited 
to functions; any relation can be specified by giving a formula for its vertical 
sections. The vertical sections of a relation z are the family of classes 

image[z, singleton[x]] = class[y,member[pair[x,y],z]]. 

One is naturally led to introduce the function which assigns these vertical 
sections: 

VERTSECT[z] == class[pair[x, y], equal[y, image[z, singleton[x]]]] 

(Formisano and Omodeo (1998) call this function V(z).) Gödel's algorithm 
converts this formula to the expression 

VERTSECT[z] == composite[Id, intersection[ 
    complement[composite[E, complement[z]]], 
    complement[composite[complement[E], z]]]]. 

Of course, for many relations z the vertical sections need not be sets. The domain 
of VERTSECT[z] in general is the class of all sets x for which image[z, singleton[x]] 
is also a set. We call a relation thin when all vertical sections are sets. In 
addition to functions, there are many important relations, such as inverse[E] and 
inverse[S], that are thin. 

Using Otter, we have proved many facts about VERTSECT, making it unnec- 
essary to repeat such work for individual functions. 

When f is a function, image[f, singleton[x]] is a singleton, and one can 
select the element in that singleton either by applying the sum class operation 
U, as Quaife does, or by applying the unary intersection operation A defined by 

class[u, forall[v, implies[member[v, x], member[u, v]]]] --> A[x] 

or equivalently, 

complement[image[complement[inverse[E]], x]] --> A[x]. 

The difference between using U and A only affects the case that x is a proper 
class. Nevertheless, using A instead of U in the definition of application has many 
practical advantages. 

For example, one can use VERTSECT to obtain a formula for any function from 
a formula for its application A[image[f, singleton[x]]]. This can be done neatly 
in the GOEDEL program by introducing the Mathematica definition 

lambda[x_, e_] := 
    Module[{y = Unique[]}, VERTSECT[class[pair[x, y], member[y, e]]]] 

This Mathematica function lambda satisfies: 

FUNCTION[f] := True; lambda[x, A[image[f, singleton[x]]]] --> f. 

It should be noted that nothing like this works when one replaces A by U 
because U does not distinguish between 0 and singleton[0], whereas A does. 
For the constant function f := cart[x, singleton[0]], for example, one has 
U[image[f, singleton[y]]] --> 0, whereas 

A[image[f, singleton[y]]] --> 
    complement[image[V, intersection[x, singleton[y]]]]. 

Because the formula for U[image[f, singleton[y]]] has lost all information about 
the domain x of the function f, one cannot reconstruct f from this formula, but 
one can reconstruct f from the formula for A[image[f, singleton[y]]]. 
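The contrast between U and A for the constant function can be replayed in a finite model. This is a sketch only: V is taken to be the four hereditarily finite sets of rank below 3, classes are subsets of V, and big_union, big_intersection and constant_zero are names we introduce to stand for U, A and cart[x, singleton[0]].

```python
from itertools import chain, combinations

def powerset(s):
    items = list(s)
    return frozenset(frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))

EMPTY = frozenset()
V = powerset(powerset(powerset(EMPTY)))        # toy universe with four sets
V_REL = {(u, v) for u in V for v in V}         # V read as a relation

def image(rel, x):
    return frozenset(v for (u, v) in rel if u in x)

def big_union(c):
    """U[c]: the union of all members of c."""
    return frozenset(chain.from_iterable(c))

def big_intersection(c):
    """A[c]: members of V that belong to every member of c, so A[0] is V."""
    return frozenset(u for u in V if all(u in v for v in c))

def constant_zero(x):
    """cart[x, singleton[0]]: the function on x whose value is always 0."""
    return {(a, EMPTY) for a in x}

results = []
for x in powerset(V):
    f = constant_zero(x)
    for y in V:
        section = image(f, {y})
        # Right-hand side of the rule for A quoted in the text.
        expected = V - image(V_REL, x & {y})
        results.append((big_union(section) == EMPTY,
                        big_intersection(section) == expected))
print(all(a and b for a, b in results))
```

The first component shows that U returns 0 whether or not y lies in the domain x, while the second confirms that A returns 0 inside the domain and V outside it, which is the information needed to recover f.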

As examples of definitions obtained using lambda we mention the function 
SINGLETON which takes any set to its singleton, 

lambda[x, singleton[x]] --> VERTSECT[Id], 

and the function POWER which takes any set to its power set, 

lambda[x, P[x]] --> VERTSECT[inverse[S]]. 

The function VERTSECT[x] itself satisfies 

lambda[w, image[x, singleton[w]]] --> VERTSECT[x]. 
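The two definitions above can be illustrated by computing vertical sections directly. The sketch assumes a finite universe consisting of the four subsets of {0, 1}, ignores the distinction between sets and proper classes, and represents VERTSECT[z] as a Python dictionary; all helper names are ours.

```python
from itertools import chain, combinations

def powerset(s):
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

UNIVERSE = powerset({0, 1})       # four "sets": {}, {0}, {1}, {0, 1}

def image(rel, x):
    return frozenset(v for (u, v) in rel if u in x)

def vertsect(rel):
    """Send each x to its vertical section image[rel, singleton[x]]."""
    return {x: image(rel, {x}) for x in UNIVERSE}

ID = {(x, x) for x in UNIVERSE}                               # identity relation
S = {(u, v) for u in UNIVERSE for v in UNIVERSE if u <= v}    # subset relation
INVERSE_S = {(v, u) for (u, v) in S}

singleton_fn = vertsect(ID)           # x -> singleton[x]
power_fn = vertsect(INVERSE_S)        # x -> P[x]

print(all(singleton_fn[x] == frozenset({x}) for x in UNIVERSE))
print(all(power_fn[x] == frozenset(powerset(x)) for x in UNIVERSE))
```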

In addition to VERTSECT, it is also convenient to introduce a related construc- 
tor IMAGE, defined by 

VERTSECT[composite[x_, inverse[E]]] := IMAGE[x]. 

The function IMAGE[x] satisfies 

lambda[u, image[x, u]] --> IMAGE[x]. 

The definition IMAGE[inverse[E]] := BIGCUP of the function BIGCUP which 
corresponds to the constructor U[x] was one of the first applications found 
for IMAGE. The function IMAGE[inverse[S]] is the hereditary closure operator, 
which takes any set x to its hereditary closure image[inverse[S], x]. This 
function is closely related to the POWER function mentioned earlier. The functions 
IMAGE[FIRST] and IMAGE[SECOND] take x to its domain and range, respectively, 
while IMAGE[SWAP] takes x to its inverse. The function IMAGE[cross[u, v]] takes 
x to composite[v, x, inverse[u]]. For example, the function that corresponds 
to the constructor flip is IMAGE[cross[SWAP, Id]]. 

The constructor IMAGE is not a functor in the category theory sense. The 
function IMAGE[x] does not in general preserve composites, but only when the 
right hand factor is thin: 

domain[VERTSECT[t]] := V; 

IMAGE[composite[x, t]] --> composite[IMAGE[x], IMAGE[t]]. 

IMAGE preserves the global identity function: IMAGE[Id] --> Id; but in general 
IMAGE[id[x]] is not an identity function. It is nonetheless a useful function: 

lambda[w, intersection[x, w]] --> IMAGE[id[x]]. 

An important application of VERTSECT is to provide a mechanism for recov- 
ering a function f from a formula for composite[inverse[E], f]. One can use 
VERTSECT to cancel factors of inverse[E]; for example, the Mathematica input 

FUNCTION[f1] := True; FUNCTION[f2] := True; 
domain[f1] := V; domain[f2] := V; 

Map[VERTSECT, composite[inverse[E], f1] == composite[inverse[E], f2]] 

produces the output f1 == f2. When the assumptions about the domains of 
the functions are omitted, the results are slightly more complicated, but one 
can nonetheless obtain a formula for each function in terms of the other. 
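The cancellation of inverse[E] can be mimicked in a finite model in which every function is total. The sketch below is an illustration under our own assumptions: V is the four hereditarily finite sets of rank below 3, functions are Python dictionaries on V, and compose_inverse_E and cancel_inverse_E are hypothetical names for composite[inverse[E], f] and for VERTSECT applied to the result.

```python
from itertools import chain, combinations, product

def powerset(s):
    items = list(s)
    return frozenset(frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))

EMPTY = frozenset()
V = list(powerset(powerset(powerset(EMPTY))))   # toy universe with four sets

def compose_inverse_E(f):
    """composite[inverse[E], f]: all pairs (x, u) with u a member of f(x)."""
    return {(x, u) for x in V for u in f[x]}

def cancel_inverse_E(rel):
    """VERTSECT of rel: x -> image[rel, singleton[x]], as a dictionary."""
    return {x: frozenset(u for (a, u) in rel if a == x) for x in V}

# Every total function from V to V, mirroring the assumption domain[f] := V.
functions = [dict(zip(V, values)) for values in product(V, repeat=len(V))]
recovered = all(cancel_inverse_E(compose_inverse_E(f)) == f for f in functions)
print(len(functions), recovered)
```

Since each function is recovered from its composite with inverse[E], two functions with equal composites must be equal, which is the content of the output f1 == f2.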

It is possible to use VERTSECT to construct other such cancellation machines 
which cancel factors of S, inverse[S] or DISJOINT. These machines were found 
to be quite useful in our investigations of the binary functions which correspond 
to the constructors intersection, cart, union and so forth. 

6 Binary Functions and Proof by Rotation 

Binary functions such as CART, CAP, CUP, corresponding to the constructors cart, 
intersection, union, are important for obtaining variable-free expressions in 
many applications. To apply the lambda formalism to these functions, it is con- 
venient to introduce the abbreviations 

first[x_] := A[domain[singleton[x]]]; 
second[x_] := A[range[singleton[x]]]. 

One then has 

lambda[x, intersection[first[x], second[x]]] --> CAP, 

lambda[x, union[first[x], second[x]]] --> CUP, 

lambda[x, cart[first[x], second[x]]] --> CART. 

(It should be noted that first and second here are technically different from 
the rather similar constructors 1st and 2nd introduced by Quaife.) 

Although Gödel's rotate functor can be completely eliminated, it nevertheless 
has many desirable properties. For example, the rotate functor 
preserves unions, intersections and relative complements, whereas composite 
preserves only unions. In the study of binary functions, the rotate constructor 
has turned out to be extremely useful. Often we can take one equation for binary 
functions and rotate it to obtain another. 

The SYMDIF function corresponding to the symmetric difference operation 
is rotation invariant. Schroeder’s transposition theorem can be given a succinct 
variable-free formulation as the statement that the relation 

composite[DISJOINT, COMPOSE] 

is rotation invariant, where DISJOINT is class[pair[x, y], disjoint[x, y]], and 
COMPOSE is the binary function corresponding to composite. 
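Rotation invariance of SYMDIF and the different behaviour of rotate and composite on intersections can be checked on small finite relations. In this sketch the direction of the cyclic permutation, ((a, b), c) to ((b, c), a), is our assumption rather than a statement about GOEDEL's rotate; the invariance of SYMDIF holds for either direction of a cyclic rotation.

```python
from itertools import chain, combinations

def powerset(s):
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

SETS = powerset({0, 1})

def rotate(rel):
    """Cyclically permute triples: ((a, b), c) becomes ((b, c), a)."""
    return {((b, c), a) for ((a, b), c) in rel}

def composite(r, s):
    """composite[r, s]: first apply s, then r."""
    return {(a, c) for (a, b) in s for (b2, c) in r if b == b2}

# SYMDIF as a binary function: pairs ((x, y), symmetric difference of x and y).
SYMDIF = {((x, y), x ^ y) for x in SETS for y in SETS}
symdif_invariant = rotate(SYMDIF) == SYMDIF

# rotate is a bijection on triples, so it commutes with intersection.
R_A = {((0, 1), 2), ((1, 2), 0)}
R_B = {((1, 2), 0), ((2, 0), 1)}
rotate_keeps_meets = rotate(R_A & R_B) == rotate(R_A) & rotate(R_B)

# composite does not: two distinct witnesses are merged by T.
T = {(2, 4), (3, 4)}
R1, R2 = {(1, 2)}, {(1, 3)}
lhs = composite(T, R1 & R2)
rhs = composite(T, R1) & composite(T, R2)
print(symdif_invariant, rotate_keeps_meets, lhs == rhs)
```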

We mention three applications of these binary functions for defining classes. 
The class of all transitive relations can be specified as: 

class[x, subclass[composite[x, x], x]] --> fix[composite[S, COMPOSE, DUP]]. 

The class of all disjoint collections, specified as the input 

class[z, forall[x, y, implies[and[member[x, z], member[y, z]], 
    or[equal[x, y], disjoint[x, y]]]]] 

produces 

fix[image[inverse[CART], P[union[DISJOINT, Id]]]] 

as output. The class of all topologies, input as 

class[t, and[subclass[image[BIGCUP, P[t]], t], 
    subclass[image[CAP, cart[t, t]], t]]] 

produces the output 

intersection[ 
    complement[fix[composite[complement[E], BIGCUP, inverse[S]]]], 
    fix[composite[S, IMAGE[CAP], CART, DUP]]]. 

7 Conclusion 

Proving theorems in set theory with a first order theorem prover such as Otter is 
greatly facilitated by the use of a companion program GOEDEL which permits one 
to automatically translate from the notations commonly used in mathematics to 
the special language needed for the Gödel theory of classes. 

Having an arsenal of set-theoretic functions that correspond to the function 
symbols of first order logic proves to be useful for systematically eliminating ex- 
istential quantifiers and thereby avoiding the Skolem functions produced when 
formulas with existential quantifiers are converted to clause form. Although the 
main focus in this talk was on the use of the GOEDEL program to help find 
convenient definitions for all these functions, the GOEDEL program also permits 
one to discover useful formulas that these functions satisfy. As these formulas 
have been added as new simplification rules, the program has grown increasingly 
powerful over the years. 

The GOEDEL program currently contains well over three thousand simplifica- 
tion rules, many of which have been proved valid using Otter. The simplification 
rules can be used not only to simplify definitions, but also to simplify statements. 
This power to simplify statements has led to the discovery of many new formu- 
las, especially new demodulators. Experience with Otter indicates that searches 
for proofs are dramatically improved by the presence of demodulators even when 
they are not directly used in the proof of a theorem because they help to combat 
combinatorial explosion. 



Appendix. An Extract from the GOEDEL Program 

Print[":Package Title: GOEDEL.M    2000 January 13 at 6:45 a.m. "]; 

(* 
:Context: Goedel` 

:Mathematica Version: 3.0 :Author: Johan G. F. Belinfante 

:Summary: The GOEDEL program implements Goedel's algorithm for class 
formation, modified to avoid assuming the axiom of regularity, and 
Kuratowski's construction of ordered pairs. 

:Sources: <description of algorithm, information for experts> 
Kurt Goedel, 1939 monograph on consistency of the axiom of choice and 
the generalized continuum hypothesis, pp. 9-14. 

:Warnings: <description of global effects, incompatibilities> 
0 is used to represent the empty set. 
E is used to represent the membership relation. 

:Limitations: <special cases not handled, known problems> 
The simplification rules are not confluent; termination is not assured. 
There is no user control over the order that simplification rules are applied. 
This stripped down version of GOEDEL51.A23 lacks 95% of the simplification 
rules needed to produce good output. Mathematica's builtin Tracing commands 
are the only mechanism for discovering what rules were actually applied. 

:Examples: Sample files are available for various test suites. 
*) 

BeginPackage["Goedel`"] 

and::usage = "and[x,y,...] is conjunction" 

assert::usage = "assert[p] produces a statement equivalent to p by applying Goedel's 
algorithm to class[w,p]. Applying assert repeatedly sometimes simplifies a statement." 

cart::usage = "cart[x,y] is the cartesian product of classes x and y." 

class::usage = "class[x,p] applies Goedel's algorithm to the class of all sets x 
satisfying the condition p. The variable x may be atomic, or of the form pair[u,v], 
where u and v in turn can be pairs, etc." 

complement::usage = "complement[x] is the class of all sets that do not belong to x" 
composite::usage = "composite[x,y,...] composite of x,y,..." 
domain::usage = "domain[x] is the domain of x" 
E::usage = "E is the membership relation" 
equal::usage = "equal[x,y] is the statement that the classes x and y are equal" 
exists::usage = "exists[x,y,...,p] means there are sets x,y,... such that p" 
FIRST::usage = "FIRST is the function which takes pair[x,y] to x" 
forall::usage = "forall[x,y,...,p] means that p holds for all sets x,y,..." 
Id::usage = "Id is the identity relation" 
id::usage = "id[x] is the restriction of the identity relation to x" 
image::usage = "image[x,y] is the image of the class y under x" 
intersection::usage = "intersection[x,y,...] is the intersection of classes x,y,..." 
inverse::usage = "the relation inverse[x] is the inverse of x" 
LeftPairV::usage = "LeftPairV is the function that takes x to pair[V,x]" 
member::usage = "member[x,y] is the statement that x belongs to y" 
not::usage = "not[p] represents the negation of p" 
or::usage = "or[x,y,...] is the inclusive or" 
P::usage = "the power class P[x] is the class of all subsets of x" 
pair::usage = "pair[x,y] is the ordered pair of x and y." 
range::usage = "range[x] is the range of x" 
RightPairV::usage = "RightPairV is the function that takes x to pair[x,V]" 
S::usage = "S is the subset relation" 
SECOND::usage = "SECOND is the function that maps pair[x,y] to y" 
singleton::usage = "singleton[x] has no member except x; it is 0 if x is not a set" 
subclass::usage = "subclass[x,y] is the statement that x is contained in y" 
U::usage = "the sum class U[x] is the union of all sets belonging to x" 
union::usage = "union[x,y,...] is the union of the classes x,y,..." 
V::usage = "the universal class" 

Begin["`Private`"]  (* begin the private context *) 

(* definitions of auxiliary functions not exported *) 
varlist[u_] := {u} /; AtomQ[u] 
varlist[pair[u_, v_]] := Union[varlist[u], varlist[v]] 

(* Is the expression x free of all variables which occur in y? *) 
allfreeQ[x_, y_] := Apply[And, Map[FreeQ[x, #]&, varlist[y]]] 

(* definitions of exported functions *) 

(* Rules that must be assigned before attributes are set. *) 
and[p_] := p 
or[p_] := p 

Attributes[and] := {Flat, Orderless, OneIdentity} 
Attributes[or] := {Flat, Orderless, OneIdentity} 

composite[x_] := composite[Id, x] 
intersection[x_] := x 
union[x_] := x 

Attributes[composite] := {Flat, OneIdentity} 
Attributes[intersection] := {Flat, Orderless, OneIdentity} 
Attributes[union] := {Flat, Orderless, OneIdentity} 

not[True] := False    (* Truth Table *) 
not[False] := True 

(* abbreviation for multiple quantifiers *) 
exists[x_, y__, p_] := exists[x, exists[y, p]] 

(* elimination rule for universal quantifiers *) 
forall[x__, y_] := not[exists[x, not[y]]] 

(* basic rules for membership *) 
member[u_, 0] := False 

(* Added to avoid assuming axiom of regularity. Goedel assumes member[x,x] = 0. *) 
class[w_, member[x_, x_]] := 
    Module[{y = Unique[]}, class[w, exists[y, and[member[x, y], equal[x, y]]]]] 

class[z_, member[w_, cart[x_, y_]]] := Module[{u = Unique[], v = Unique[]}, 
    class[z, exists[u, v, and[equal[pair[u, v], w], member[u, x], member[v, y]]]]] 

member[pair[u_, v_], cart[x_, y_]] := and[member[u, x], member[v, y]] 

member[u_, complement[x_]] := and[member[u, V], not[member[u, x]]] 

class[w_, member[z_, composite[x_, y_]]] := 
    Module[{t = Unique[], u = Unique[], v = Unique[]}, 
    class[w, exists[t, u, v, and[equal[z, pair[u, v]], and[member[pair[u, t], y], 
        member[pair[t, v], x]]]]]] 

class[z_, member[w_, cross[x_, y_]]] := 
    Module[{u1 = Unique[], u2 = Unique[], v1 = Unique[], v2 = Unique[]}, 
    class[z, exists[u1, u2, v1, v2, and[equal[pair[pair[u1, u2], pair[v1, v2]], w], 
        member[pair[u1, v1], x], member[pair[u2, v2], y]]]]] 

(* Goedel's definition 1.5 *) 
class[w_, member[u_, domain[x_]]] := Module[{v = Unique[]}, class[w, exists[v, 
    and[member[u, V], member[pair[u, v], x]]]]] 

class[z_, member[w_, E]] := Module[{u = Unique[], v = Unique[]}, class[z, exists[u, v, 
    and[equal[pair[u, v], w], member[u, v]]]]] 

class[w_, member[x_, FIRST]] := Module[{u = Unique[], v = Unique[]}, class[w, 
    exists[u, v, equal[pair[pair[u, v], u], x]]]] 

class[z_, member[w_, Id]] := Module[{u = Unique[]}, class[z, exists[u, equal[w, pair[u, u]]]]] 

class[z_, member[w_, id[x_]]] := Module[{u = Unique[]}, class[z, exists[u, 
    and[member[u, x], equal[w, pair[u, u]]]]]] 

class[w_, member[v_, image[z_, x_]]] := Module[{u = Unique[]}, class[w, exists[u, 
    and[member[v, V], member[u, x], member[pair[u, v], z]]]]] 

member[u_, intersection[x_, y_]] := and[member[u, x], member[u, y]] 

class[x_, member[w_, inverse[z_]]] := Module[{u = Unique[], v = Unique[]}, 
    class[x, exists[u, v, and[equal[pair[u, v], w], member[pair[v, u], z]]]]] 

class[x_, member[w_, LeftPairV]] := Module[{u = Unique[], v = Unique[]}, class[x, 
    exists[u, v, and[equal[pair[u, v], w], equal[v, pair[V, u]]]]]] 

member[x_, P[y_]] := and[member[x, V], subclass[x, y]] 

class[u_, member[v_, pair[x_, y_]]] := Module[{z = Unique[]}, class[u, exists[z, 
    and[equal[pair[x, y], z], member[v, z]]]]] 

class[w_, member[v_, range[z_]]] := Module[{u = Unique[]}, class[w, exists[u, 
    and[member[v, V], member[pair[u, v], z]]]]] 



class[x_, member[w_, RightPairV]] := Module[{u = Unique[], v = Unique[]}, class[x, 
    exists[u, v, and[equal[pair[u, v], w], equal[v, pair[u, V]]]]]] 

class[w_, member[x_, S]] := Module[{u = Unique[], v = Unique[]}, class[w, exists[u, v, 
    and[equal[pair[u, v], x], subclass[u, v]]]]] 

class[w_, member[x_, SECOND]] := Module[{u = Unique[], v = Unique[]}, class[w, 
    exists[u, v, equal[pair[pair[u, v], v], x]]]] 

member[u_, singleton[x_]] := and[equal[u, x], member[u, V]] 

member[u_, union[x_, y_]] := or[member[u, x], member[u, y]] 

class[w_, member[x_, U[z_]]] := Module[{y = Unique[]}, class[w, exists[y, 
    and[member[x, y], member[y, z]]]]] 

class[w_, subclass[x_, y_]] := Module[{u = Unique[]}, class[w, 
    forall[u, or[not[member[u, x]], member[u, y]]]]] 

class[x_, False] := 0 
class[x_, True] := V /; AtomQ[x] 
class[pair[u_, v_], True] := cart[class[u, True], class[v, True]] 
class[u_, member[u_, x_]] := x /; And[FreeQ[x, u], AtomQ[u]] 

(* axiom B.1 membership relation *) 
class[pair[u_, v_], member[u_, v_]] := E /; And[AtomQ[u], AtomQ[v]] 

(* axiom B.2 intersection *) 
class[x_, and[p_, q_]] := intersection[class[x, p], class[x, q]] 
class[x_, or[p_, q_]] := union[class[x, p], class[x, q]] 

(* axiom B.3 complement *) 
class[x_, not[p_]] := intersection[complement[class[x, p]], class[x, True]] 

(* axiom B.4 domain and Goedel's equation 2.8 on page 9 *) 
class[x_, exists[y_, p_]] := domain[class[pair[x, y], p]] 

(* axiom B.5 cartesian product *) 
class[pair[u_, v_], member[u_, x_]] := 
    cart[x, V] /; And[FreeQ[x, u], FreeQ[x, v], AtomQ[u], AtomQ[v]] 

(* axiom B.6 inverse *) 
class[pair[u_, v_], member[v_, u_]] := inverse[E] /; And[AtomQ[u], AtomQ[v]] 

(* an interpretation of Goedel's equation 2.41 on page 9 *) 
class[pair[u_, v_], p_] := cart[class[u, p], class[v, True]] /; allfreeQ[p, v] 

(* an interpretation of Goedel's equation 2.7 on page 9 *) 
class[pair[u_, v_], p_] := cart[class[u, True], class[v, p]] /; allfreeQ[p, u] 

(* Four rules to replace the rotation rules on Goedel's page 9: *) 
class[pair[pair[u_, v_], w_], p_] := composite[class[pair[v, w], p], SECOND, 
    id[cart[class[u, True], V]]] /; allfreeQ[p, u] 

class[pair[pair[u_, v_], w_], p_] := composite[class[pair[u, w], p], FIRST, 
    id[cart[V, class[v, True]]]] /; allfreeQ[p, v] 

class[pair[w_, pair[u_, v_]], p_] := composite[id[cart[class[u, True], V]], 
    inverse[SECOND], class[pair[w, v], p]] /; allfreeQ[p, u] 

class[pair[w_, pair[u_, v_]], p_] := composite[id[cart[V, class[v, True]]], 
    inverse[FIRST], class[pair[w, u], p]] /; allfreeQ[p, v] 

(* special maneuver on page 10 of Goedel's monograph *) 
class[u_, member[x_, y_]] := Module[{v = Unique[]}, 
    class[u, exists[v, and[equal[x, v], member[v, y]]]]] /; FreeQ[varlist[u], x] 

(* new rules for equality *) 
equal[x_, x_] := True 

class[pair[u_, v_], equal[u_, v_]] := Id /; And[AtomQ[u], AtomQ[v]] 
class[pair[u_, v_], equal[v_, u_]] := Id /; And[AtomQ[u], AtomQ[v]] 

class[x_, equal[x_, y_]] := 
    intersection[singleton[y], class[x, True]] /; allfreeQ[y, x] 

(* Goedel's Axiom A.3 of Coextension. *) 
class[w_, equal[x_, y_]] := 
    intersection[class[w, subclass[x, y]], class[w, subclass[y, x]]] /; 
    And[Or[Not[MemberQ[varlist[w], x]], Not[MemberQ[varlist[w], y]]], 
        Not[SameQ[Head[x], pair]], Not[SameQ[Head[y], pair]]] 

class[x_, equal[y_, x_]] := intersection[singleton[y], class[x, True]] /; allfreeQ[y, x] 

equal[pair[x_, y_], 0] := False 
equal[pair[x_, y_], V] := False 

(* equality of pairs *) 
equal[pair[u_, v_], pair[x_, y_]] := and[equal[singleton[u], singleton[x]], 
    equal[singleton[v], singleton[y]]] 

class[w_, equal[singleton[u_], singleton[v_]]] := class[w, equal[u, v]] /; 
    MemberQ[varlist[w], u] || MemberQ[varlist[w], v] || 
    member[u, V] || member[v, V] 

(* flip equations involving a single pair to put pair on the left *) 
equal[x_, y_pair] := equal[y, x] 

(* rules that apply when x or y is known not to be a set *) 
pair[x_, y_] := pair[V, y] /; Not[V === x] && not[member[x, V]] 
pair[x_, y_] := pair[x, V] /; Not[V === y] && not[member[y, V]] 

(* rule that applies when z does not occur in varlist[u] or when z occurs in x or y. *) 
class[u_, equal[pair[x_, y_], z_]] := Module[{v = Unique[]}, 
    class[u, exists[v, and[equal[pair[x, y], v], equal[v, z]]]]] /; 
    Not[MemberQ[varlist[u], z]] || Not[FreeQ[{x, y}, z]] 

(* rule that applies when z does occur in varlist[w] and z does not occur in either x or y. 
   This rule only applies when x and y are known to be sets. *) 
class[w_, equal[pair[x_, y_], z_]] := Module[{u = Unique[], v = Unique[]}, 
    class[(w /. z -> pair[u, v]), and[equal[x, u], equal[y, v]]]] /; 
    And[MemberQ[varlist[w], z], FreeQ[{x, y}, z], 
        Or[member[x, V], MemberQ[varlist[w], x]], 
        Or[member[y, V], MemberQ[varlist[w], y]]] 

(* rule that applies when one does not know whether or not x is a set *) 
class[u_, equal[pair[x_, y_], z_]] := Module[{v = Unique[]}, 
    class[u, or[and[not[member[x, V]], equal[pair[V, y], z]], 
        exists[v, and[equal[x, v], equal[pair[v, y], z]]]]]] /; 
    Not[MemberQ[varlist[u], x]] && UnsameQ[V, x] && Not[member[x, V] === True] 

(* rule that applies when one does not know whether or not y is a set *) 
class[u_, equal[pair[x_, y_], z_]] := Module[{v = Unique[]}, 
    class[u, or[and[not[member[y, V]], equal[pair[x, V], z]], 
        exists[v, and[equal[y, v], equal[pair[x, v], z]]]]]] /; 
    Not[MemberQ[varlist[u], y]] && UnsameQ[V, y] && Not[member[y, V] === True] 

class[pair[u_, v_], equal[pair[V, u_], v_]] := LeftPairV 
class[pair[u_, v_], equal[pair[u_, V], v_]] := RightPairV 
class[pair[u_, v_], equal[pair[V, v_], u_]] := inverse[LeftPairV] 
class[pair[u_, v_], equal[pair[v_, V], u_]] := inverse[RightPairV] 

image[inverse[RightPairV], x_] := 0 /; composite[Id, x] == x 
image[inverse[LeftPairV], x_] := 0 /; composite[Id, x] == x 

class[w_, equal[pair[V, y_], z_]] := Module[{v = Unique[]}, 
    class[w, or[and[not[member[y, V]], equal[pair[V, V], z]], 
        and[member[y, V], exists[v, and[equal[pair[V, v], z], equal[v, y]]]]]]] /; 
    Not[allfreeQ[y, w]] 

class[w_, equal[pair[x_, V], z_]] := Module[{v = Unique[]}, 
    class[w, or[and[not[member[x, V]], equal[pair[V, V], z]], 
        and[member[x, V], exists[v, and[equal[pair[v, V], z], equal[v, x]]]]]]] /; 
    Not[allfreeQ[x, w]] 

class[w_, equal[pair[V, V], w_]] := singleton[pair[V, V]] 

class[w_, equal[pair[V, V], x_]] := Module[{v = Unique[]}, 
    class[w, exists[v, and[equal[pair[V, V], v], equal[v, x]]]]] /; 
    Not[MemberQ[x, varlist[w]]] 

(* assertions *) 
assert[p_] := Module[{w = Unique[]}, equal[V, class[w, p]]] 

(* a few simplification rules *) 
cart[x_, 0] := 0 
cart[0, x_] := 0 
complement[0] := V 
complement[complement[x_]] := x 
complement[union[x_, y_]] := intersection[complement[x], complement[y]] 
complement[V] := 0 
composite[x_, cart[y_, z_]] := cart[y, image[x, z]] 
composite[cart[x_, y_], z_] := cart[image[inverse[z], x], y] 
composite[Id, x_, y_] := composite[x, y] 
composite[x_, Id] := composite[Id, x] 
composite[Id, Id] := Id 
domain[cart[x_, y_]] := intersection[x, image[V, y]] 
domain[composite[x_, y_]] := image[inverse[y], domain[x]] 
domain[Id] := V 
domain[id[x_]] := x 
id[V] := Id 
image[0, x_] := 0 
image[x_, 0] := 0 
image[composite[x_, y_], z_] := image[x, image[y, z]] 
image[Id, x_] := x 
image[id[x_], y_] := intersection[x, y] 
intersection[cart[x_, y_], z_] := composite[id[y], z, id[x]] 
intersection[V, x_] := x 
inverse[0] := 0 
inverse[cart[x_, y_]] := cart[y, x] 
inverse[complement[x_]] := composite[Id, complement[inverse[x]]] 
inverse[composite[x_, y_]] := composite[inverse[y], inverse[x]] 
inverse[Id] := Id 
inverse[inverse[x_]] := composite[Id, x] 
range[Id] := V 
union[0, x_] := x 

End[]  (* end the private context *) 

Protect[and, assert, cart, class, complement, composite, domain, E, 
    equal, exists, FIRST, forall, Id, id, image, intersection, inverse, 
    LeftPairV, member, not, or, P, pair, range, RightPairV, S, SECOND, 
    singleton, subclass, U, union, V] 

EndPackage[]  (* end the package context *) 
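
The class-formation rules for axioms B.2, B.3 and B.4 in the extract translate connectives and the existential quantifier into intersection, union, complement and domain. A minimal Python sketch of that correspondence follows, assuming a five-element toy universe and hypothetical helpers cls, cls_pairs, domain and complement; it does not reproduce the rewriting strategy of the GOEDEL program, only the set-theoretic identities behind those rules.

```python
U = frozenset(range(5))           # finite stand-in for the universal class V

def cls(pred):
    """class[x, pred]: the members x of the toy universe satisfying pred."""
    return frozenset(x for x in U if pred(x))

def cls_pairs(pred):
    """class[pair[x, y], pred]: the pairs (x, y) satisfying pred."""
    return frozenset((x, y) for x in U for y in U if pred(x, y))

def domain(rel):
    return frozenset(x for (x, _) in rel)

def complement(c):
    return U - c

even = lambda x: x % 2 == 0
small = lambda x: x < 3
less = lambda x, y: x < y

# class[x, and[p, q]] := intersection[class[x, p], class[x, q]]
rule_and = cls(lambda x: even(x) and small(x)) == cls(even) & cls(small)
# class[x, or[p, q]] := union[class[x, p], class[x, q]]
rule_or = cls(lambda x: even(x) or small(x)) == cls(even) | cls(small)
# class[x, not[p]] := intersection[complement[class[x, p]], class[x, True]]
rule_not = cls(lambda x: not even(x)) == complement(cls(even)) & cls(lambda x: True)
# class[x, exists[y, p]] := domain[class[pair[x, y], p]]
rule_exists = cls(lambda x: any(less(x, y) for y in U)) == domain(cls_pairs(less))
print(rule_and, rule_or, rule_not, rule_exists)
```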

Abstract. We provide techniques to integrate resolution logic with 
equality in type theory. The results may be rendered as follows. 

— A clausification procedure in type theory, equipped with a correct- 
ness proof, all encoded using higher-order primitive recursion. 

— A novel representation of clauses in minimal logic such that the 
λ-representation of resolution proofs is linear in the size of the 
premisses. 

— A translation of resolution proofs into lambda terms, yielding a ver- 
ification procedure for those proofs. 

— The power of resolution theorem provers becomes available in inter- 
active proof construction systems based on type theory. 



1 Introduction 

Type theory (= typed Lambda Calculus) offers a powerful formalism for formal- 
izing mathematics. Strong points are: the logical foundation, the fact that proofs 
are first-class citizens, and the generality which naturally facilitates extensions, 
such as inductive types. Type theory captures definitions, reasoning and com- 
putation at various levels in an integrated way. In a type-theoretical system, 
formalized mathematical statements are represented by types, and their proofs 
are represented by λ-terms. The problem whether π is a proof of statement A 
reduces to checking whether the term π has type A. Computation is based on a 
simple notion of rewriting. The level of detail is such that the well-formedness 
of definitions and the correctness of derivations can automatically be verified. 

However, there are also weak points. It is exactly the appraised expressivity 
and the level of detail that make automation at the same time necessary and 
difficult. Automated deduction appears to be mostly successful in weak systems, 
such as propositional logic and predicate logic, systems that fall short of 
formalizing a larger body of mathematics. Apart from the problem of the 
expressivity of these systems, only a minor part of the theorems that can be 
expressed can actually be proved automatically. Therefore it is necessary to 
combine automated theorem proving with interactive theorem proving. Recently a 
number of proposals in this direction have been made. In [MS99] Otter is combined with the 
Boyer-Moore theorem prover. (A verified program rechecks proofs generated by 
Otter.) In [Hur99] Gandalf is linked to HOL. (The translation generates scripts to 
be run by the HOL-system.) In [ST95], proofs are translated into Martin-Löf's 
type theory, for the Horn clause fragment of first-order logic. In the Omega 
system [Hua96, Omega] various theorem provers have been linked to a natural 
deduction proof checker. The purpose there is to automatically generate proofs 
from so-called proof plans. Our approach is different in that we generate complete 
proof objects for both the clausification and the refutation part. 

Resolution theorem provers, such as Bliksem [Blk], are powerful, but have the 
drawback that they work with normal forms of formulae, so-called clausal forms. 
Clauses are (universally closed) disjunctions of literals, and a literal is either an 
atom or a negated atom. The clausal form of a formula is essentially its Skolem- 
conjunctive normal form, which need not be exactly logically equivalent to the 
original formula. This makes resolution proofs hard to read and understand, 
and makes the interactive navigation of the theorem prover through the search 
space very difficult. Moreover, optimized implementations of proof procedures 
are error-prone (cf. recent CASC disqualifications). 
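
The notions of literal, clause and resolution refutation can be made concrete with a propositional toy. The sketch below is our own illustration, not Bliksem's algorithm: it has no unification, no equality and no search strategy, and the names negate, resolve and refute are hypothetical. A literal is a pair of an atom name and a polarity, and a clause is a frozenset of literals.

```python
def negate(literal):
    """Flip the polarity of a literal (name, positive)."""
    name, positive = literal
    return (name, not positive)

def resolve(clause_a, clause_b):
    """Return every binary resolvent of two propositional clauses."""
    resolvents = set()
    for literal in clause_a:
        if negate(literal) in clause_b:
            resolvents.add((clause_a - {literal}) | (clause_b - {negate(literal)}))
    return resolvents

def refute(clauses):
    """Saturate under resolution; True when the empty clause is derived."""
    known = set(clauses)
    while True:
        new = set()
        for a in known:
            for b in known:
                new |= resolve(a, b)
        if frozenset() in new:
            return True
        if new <= known:          # nothing new: the clause set is saturated
            return False
        known |= new

P, NOT_P = ("p", True), ("p", False)
Q, NOT_Q = ("q", True), ("q", False)

# Clausal form of the negation of ((p and (p implies q)) implies q).
unsatisfiable = {frozenset({P}), frozenset({NOT_P, Q}), frozenset({NOT_Q})}
satisfiable = {frozenset({P}), frozenset({NOT_P, Q})}
print(refute(unsatisfiable), refute(satisfiable))
```

A refutation found this way certifies the original tautology only indirectly, through the clausal form, which is why a translation back into checkable proof terms is of interest.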

In type theory, the proof generation capabilities suffer from the small 
granularity of the inference steps and the corresponding astronomical size of the 
search space. Typically, one hyperresolution step requires a few dozen inference 
steps in type theory. In order to make the formalisation of a large body of mathematics 
feasible, the level of automation of interactive proof construction systems such 
as Coq [Coq98], based on type theory, has to be improved. 

We propose the following proof procedure. Identify a non-trivial step in a 
Coq session that amounts to a first-order tautology. Export this tautology to 
Bliksem, and delegate the proof search to the Bliksem inference engine. Convert 
the resolution proof to type theoretic format and import the result back in Coq. 
We stress the fact that the above procedure is as secure as Coq. Hypothetical 
errors (e.g. the clausification procedure not producing clauses, possible errors 
in the resolution theorem prover or the erroneous formulation of the lambda 
terms corresponding to its proofs) are irrelevant because the resulting proofs are 
type-checked by Coq. The security could be made independent of Coq by using 
another type-checker. 

Most of the necessary meta-theory is already known. The negation normal 
form transformation can be axiomatized by classical logic. The prenex and con- 
junctive normal form transformations require that the domain is non-empty. 
Skolemization can be axiomatized by so-called Skolem axioms, which can be 
viewed as specific instances of the Axiom of Choice. Higher-order logic is
particularly suited for this axiomatization: we get logical equivalence modulo
classical logic plus the Axiom of Choice, instead of awkward invariants such as
equiconsistency or equisatisfiability in the first-order case.

By adapting a result of Kleene, Skolem functions and Skolem axioms could be
eliminated from resolution proofs, which would allow us to obtain directly a proof
of the original formula (cf. [Pfe84]), but currently we still make use of the Axiom
of Choice.

The paper is organized as follows. In Section 2 we set out a two-level approach 
and define a deep embedding to represent first-order logic. Section 3 describes 
a uniform clausification procedure. We explain how resolution proofs are
translated into $\lambda$-terms in Sections 4 and 5. Finally, the outlined
constructions are demonstrated in Section 6.



2 A Two-Level Approach 

The basic sorts in Coq are $*^p$ and $*^s$. An object $M$ of type $*^p$ is a logical
proposition and $M$ denotes the class of proofs of $M$. Objects of type $*^s$ are usual
sets such as the set of natural numbers, lists etc. In type theory, the typing
relation is expressed by $t : T$, to be interpreted as '$t$ belongs to set $T$' when
$T : *^s$, and as '$t$ is a proof of proposition $T$' when $T : *^p$. As usual, $\to$
associates to the right; $\to$ is used for logical implication as well as for function
spaces. Furthermore, well-typed application is denoted by $(M\ N)$ and associates to
the left. Scopes of bound variables are always extended to the right as far as possible.
We use the notation

$$T : *^s := \begin{cases} \mathit{constructor}_0 : \cdots \to T \\ \qquad\vdots \\ \mathit{constructor}_n : \cdots \to T \end{cases}$$

to define the inductive set $T$, that is: the smallest set of objects that is freely
generated by $\mathit{constructor}_0, \ldots, \mathit{constructor}_n$. Moreover, we use

$$\mathit{Cases}\ t\ \mathit{of} \begin{cases} \mathit{pattern}_0 \Rightarrow \mathit{rhs}_0 \\ \qquad\vdots \\ \mathit{pattern}_m \Rightarrow \mathit{rhs}_m \end{cases}$$

for the exhaustive case analysis on $t$ in the inductive type $T$. If $t$ matches
$\mathit{pattern}_i$, it is replaced by the right-hand side $\mathit{rhs}_i$.

We choose a deep embedding, adopting a two-level approach for the
treatment of arbitrary first-order languages. The idea is to represent first-order
formulae as objects in an inductive set $o : *^s$ accompanied by an interpretation
function $E$ that maps these objects into $*^p$.¹ The next paragraphs explain why we
distinguish a higher (meta-, logical) level $*^p$ and a lower (object-, computational)
level $o$.

The universe $*^p$ includes higher-order propositions; in fact it encompasses
full impredicative type theory. As such, it is too large for our purposes. Given a
suitable signature, any first-order proposition $\varphi : *^p$ will have a formal
counterpart $p : o$ such that $\varphi$ equals $(E\ p)$, the interpretation of $p$. Thus,
the first-order fragment of $*^p$ can be identified as the collection of interpretations
of objects in $o$.

¹ Both $o$ as well as $E$ depend on a fixed but arbitrary signature.

Secondly, Coq supplies only limited computational power on $*^p$, whereas $o$,
as every inductive set, is equipped with the powerful computational device of
higher-order primitive recursion. This enables the syntactical manipulation of
object-level propositions.

Reflection of object-level propositions is used for the proof construction of
first-order formulae in $*^p$ in the following way. Let $\varphi : *^p$ be a first-order
proposition. Then there is some $\dot\varphi : o$ such that $(E\ \dot\varphi)$ is
convertible with $\varphi$.² Moreover, suppose we have proved

$$\forall p{:}o.\ (E\ (T\ p)) \to (E\ p)$$

for some function $T : o \to o$. Then, to prove $\varphi$ it suffices to prove
$(E\ (T\ \dot\varphi))$. Matters are presented schematically in Figure 1. In Section 3
we discuss a concrete function $T$, for which we have proved the above. For this $T$,
proofs of $(E\ (T\ \dot\varphi))$ will be generated automatically, as will be described
in Sections 4 and 5.

[Figure 1: diagram. On the meta-level $*^p$, $\varphi$ is implied by
$(E\ (T\ \dot\varphi))$; on the object-level $o$, $T$ maps $\dot\varphi$ to
$(T\ \dot\varphi)$; $E$ maps each object-level proposition to its meta-level
interpretation.]

Fig. 1. Schematic overview of the general procedure. The proof of the implication
from $(E\ (T\ \dot\varphi))$ to $\varphi$ can be generated uniformly in $\dot\varphi$.




Object-Level Propositions and the Reflection Operation 

In Coq, we have constructed a general framework to represent first-order lan- 
guages with multiple sorts. Bliksem is (as yet) one-sorted, so we describe the 
setup for one-sorted signatures only. 

Assume a domain of discourse $\sigma : *^s$. Suppose we have relation symbols
$R_0, \ldots, R_k$ typed $\sigma^{e_0} \to *^p, \ldots, \sigma^{e_k} \to *^p$,
respectively. Here $e_0, \ldots, e_k$ are natural numbers and $\sigma^n$ is the
Cartesian product of $n$ copies of $\sigma$, that is:

$$\sigma^0 = \mathit{unit}, \qquad \sigma^1 = \sigma, \qquad \sigma^{n+2} = \sigma \times \sigma^{n+1}.$$

The set $\mathit{unit}$ is a singleton with sole inhabitant $tt$.

² The mapping $\dot{\ }$ is a syntax-based translation outside Coq.

Let $L$ be the non-empty³ list of arities $[e_0, \ldots, e_k]$. We define $o$, the set of
objects representing propositions, inductively by:

$$o : *^s := \begin{cases} \mathit{rel} : \Pi i{:}(\mathit{index}\ L).\ \sigma^{(\mathit{select}\ L\ i)} \to o \\ \dot\neg : o \to o \\ \dot\to,\ \dot\vee,\ \dot\wedge : o \to o \to o \\ \dot\forall,\ \dot\exists : (\sigma \to o) \to o \end{cases}$$

We use the dot-notation $\dot{\ }$ to distinguish the object-level constructors from
Coq's predefined connectives. Connectives are written infix. The function
$\mathit{select}$ is of type $\Pi L{:}(\mathit{nelist}\ \mathit{nat}).\ (\mathit{index}\ L) \to \mathit{nat}$,
where $(\mathit{index}\ L)$ computes the set $\{0, \ldots, k\}$. We have
$(\mathit{select}\ L\ i) =_{\beta\delta\iota} e_i$. Thus, an atomic proposition is of
the form $(\mathit{rel}\ i\ t)$, with $t$ an argument tuple in $\sigma^{e_i}$.

The constructors $\dot\forall, \dot\exists$ map propositional functions of type
$\sigma \to o$ to propositions of type $o$. This representation has the advantage that
binding and predication are handled by $\lambda$-abstraction and $\lambda$-application.
On the object-level, existential quantification of $x$ over $p$ (of type $o$, possibly
containing occurrences of $x$) is written as $(\dot\exists\ (\lambda x{:}\sigma.\ p))$.
Although this representation suffices for our purposes, it causes some well-known
difficulties. E.g. we cannot write a boolean function which recognizes whether a given
formal proposition is in prenex normal form. As there is no canonical choice of a fresh
term in $\sigma$, it is not possible to recursively descend under abstractions in
$\lambda$-terms. See [NM98, Sections 8.3, 9.2] for a further discussion.

For our purposes, a shallow embedding of function symbols is sufficient. We
have not defined an inductive set $\mathit{term}$ representing the first-order terms in
$\sigma$ like we have defined $o$ representing the first-order fragment of $*^p$.
Instead, 'meta-level' terms of type $\sigma$ are taken as arguments of object-level
predicates. Due to this shallow embedding, we cannot check whether certain variables
have occurrences in a given term. Because of that, e.g., distributing universal
quantifiers over conjuncts can yield dummy abstractions. These problems could be
overcome by using de Bruijn-indices (see [dB72]) for a deep embedding of terms in Coq.

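The interpretation function $E$ is a canonical homomorphism recursively
defined as follows.

$$E : o \to *^p := \lambda p{:}o.\ \mathit{Cases}\ p\ \mathit{of} \begin{cases} (\mathit{rel}\ i\ t) & \Rightarrow (R_i\ t) \\ \dot\neg p_0 & \Rightarrow \neg(E\ p_0) \\ p_1 \mathbin{\dot\to} p_2 & \Rightarrow (E\ p_1) \to (E\ p_2) \\ p_1 \mathbin{\dot\vee} p_2 & \Rightarrow (E\ p_1) \vee (E\ p_2) \\ p_1 \mathbin{\dot\wedge} p_2 & \Rightarrow (E\ p_1) \wedge (E\ p_2) \\ (\dot\forall\ p_0) & \Rightarrow \forall x{:}\sigma.\ (E\ (p_0\ x)) \\ (\dot\exists\ p_0) & \Rightarrow \exists x{:}\sigma.\ (E\ (p_0\ x)) \end{cases}$$

³ We require the signature to contain at least one relation symbol.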
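The deep embedding and its interpretation can be illustrated outside Coq. The following is a minimal Python sketch, not the authors' Coq development: the class names `Rel`, `Not`, `BinOp`, `Quant` and the function `interpret` are hypothetical, quantifier bodies are Python functions (mirroring the $\lambda$-abstraction used for binding), and, as a simplifying assumption, the interpretation evaluates to a truth value over a finite domain instead of producing a proposition in $*^p$.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Object-level propositions (a stand-in for the inductive set o).
@dataclass(frozen=True)
class Rel:
    index: int          # which relation symbol R_i
    args: Tuple         # argument tuple t

@dataclass(frozen=True)
class Not:
    body: object

@dataclass(frozen=True)
class BinOp:
    op: str             # "imp", "or" or "and"
    left: object
    right: object

@dataclass(frozen=True)
class Quant:
    kind: str           # "forall" or "exists"
    body: Callable      # maps a domain element to a proposition

def interpret(p, relations, domain):
    """Analogue of E, simplified to truth values over a finite domain."""
    if isinstance(p, Rel):
        return relations[p.index](*p.args)
    if isinstance(p, Not):
        return not interpret(p.body, relations, domain)
    if isinstance(p, BinOp):
        l = interpret(p.left, relations, domain)
        r = interpret(p.right, relations, domain)
        return {"imp": (not l) or r, "or": l or r, "and": l and r}[p.op]
    if isinstance(p, Quant):
        results = (interpret(p.body(x), relations, domain) for x in domain)
        return all(results) if p.kind == "forall" else any(results)
    raise TypeError(f"not a proposition: {p!r}")

# Example: seriality of a binary relation on a three-element domain.
domain = [0, 1, 2]
less = {(0, 1), (0, 2), (1, 2), (2, 2)}
relations = [lambda x, y: (x, y) in less]
serial = Quant("forall", lambda x: Quant("exists", lambda y: Rel(0, (x, y))))
```

Because quantifier bodies are opaque functions, `interpret` can only apply them to domain elements; it cannot inspect them syntactically, which is the limitation discussed above for the higher-order representation.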




Automated Proof Construction in Type Theory Using Resolution 153 

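In the above definitions of $o$, its constructors and of $E$, the dependency on
the signature has been suppressed. In fact we have:

$$o : *^s \to (\mathit{nelist}\ \mathit{nat}) \to *^s$$

$$\mathit{rel} : \Pi\sigma{:}*^s.\ \Pi L{:}(\mathit{nelist}\ \mathit{nat}).\ \Pi i{:}(\mathit{index}\ L).\ \sigma^{(\mathit{select}\ L\ i)} \to (o\ \sigma\ L)$$

$$E : \Pi\sigma{:}*^s.\ \Pi L{:}(\mathit{nelist}\ \mathit{nat}).\ (\Pi i{:}(\mathit{index}\ L).\ \sigma^{(\mathit{select}\ L\ i)} \to *^p) \to (o\ \sigma\ L) \to *^p$$

In the next section, we fix an arbitrary signature and mention the above
dependencies implicitly only.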



3 Clausification and Correctness 

We describe the transformation to minimal clausal form (see Section 4), which is
realized on both levels. On the object-level, we define an algorithm
$\mathit{mcf} : o \to o$ that converts object-level propositions into their clausal
form. On the meta-level, clausification is realized by a term $\mathit{mcfprf}$, which
transforms a proof of $(E\ (\mathit{mcf}\ p))$ into a proof of $(E\ p)$.

The algorithm $\mathit{mcf}$ consists of the successive application of the following
functions: $\mathit{nnf}$, $\mathit{pnf}$, $\mathit{cnf}$, $\mathit{skim}$,
$\mathit{duqc}$, $\mathit{impf}$, standing for transformations to negation, prenex and
conjunctive normal form, Skolemization, distribution of universal quantifiers over
conjuncts and transformation to implicational form, respectively. As an illustration,
we describe the functions $\mathit{nnf}$ and $\mathit{skim}$.

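Concerning negation normal form, a recursive call like

$$(\mathit{nnf}\ \dot\neg(A \mathbin{\dot\wedge} B)) = (\mathit{nnf}\ \dot\neg A) \mathbin{\dot\vee} (\mathit{nnf}\ \dot\neg B)$$

is not primitive recursive, since $\dot\neg A$ and $\dot\neg B$ are not subformulae of
$\dot\neg(A \mathbin{\dot\wedge} B)$. Such a call requires general recursion. Coq's
computational mechanism is higher-order primitive recursion, which is weaker than
general recursion but ensures universal termination.

The function $\mathit{nnf} : o \to \mathit{pol} \to o$ defined below⁴ makes use of the
so-called polarity of an input formula. Polarities are:
$\mathit{pol} : *^s := \{\oplus : \mathit{pol},\ \ominus : \mathit{pol}\}$.

⁴ For $Q = \forall, \exists$, we write $\dot Q x{:}\sigma.\ p$ instead of
$(\dot Q\ (\lambda x{:}\sigma.\ p))$.

$$\mathit{nnf} : o \to \mathit{pol} \to o := \lambda p{:}o.\ \lambda a{:}\mathit{pol}.\ \mathit{Cases}\ p\ a\ \mathit{of} \begin{cases} (\mathit{rel}\ i\ t) & \oplus \Rightarrow (\mathit{rel}\ i\ t) \\ (\mathit{rel}\ i\ t) & \ominus \Rightarrow \dot\neg(\mathit{rel}\ i\ t) \\ \dot\neg p_0 & \oplus \Rightarrow (\mathit{nnf}\ p_0\ \ominus) \\ \dot\neg p_0 & \ominus \Rightarrow (\mathit{nnf}\ p_0\ \oplus) \\ p_1 \mathbin{\dot\to} p_2 & \oplus \Rightarrow (\mathit{nnf}\ p_1\ \ominus) \mathbin{\dot\vee} (\mathit{nnf}\ p_2\ \oplus) \\ p_1 \mathbin{\dot\to} p_2 & \ominus \Rightarrow (\mathit{nnf}\ p_1\ \oplus) \mathbin{\dot\wedge} (\mathit{nnf}\ p_2\ \ominus) \\ p_1 \mathbin{\dot\vee} p_2 & \oplus \Rightarrow (\mathit{nnf}\ p_1\ \oplus) \mathbin{\dot\vee} (\mathit{nnf}\ p_2\ \oplus) \\ p_1 \mathbin{\dot\vee} p_2 & \ominus \Rightarrow (\mathit{nnf}\ p_1\ \ominus) \mathbin{\dot\wedge} (\mathit{nnf}\ p_2\ \ominus) \\ p_1 \mathbin{\dot\wedge} p_2 & \oplus \Rightarrow (\mathit{nnf}\ p_1\ \oplus) \mathbin{\dot\wedge} (\mathit{nnf}\ p_2\ \oplus) \\ p_1 \mathbin{\dot\wedge} p_2 & \ominus \Rightarrow (\mathit{nnf}\ p_1\ \ominus) \mathbin{\dot\vee} (\mathit{nnf}\ p_2\ \ominus) \\ (\dot\forall\ p_0) & \oplus \Rightarrow \dot\forall x{:}\sigma.\ (\mathit{nnf}\ (p_0\ x)\ \oplus) \\ (\dot\forall\ p_0) & \ominus \Rightarrow \dot\exists x{:}\sigma.\ (\mathit{nnf}\ (p_0\ x)\ \ominus) \\ (\dot\exists\ p_0) & \oplus \Rightarrow \dot\exists x{:}\sigma.\ (\mathit{nnf}\ (p_0\ x)\ \oplus) \\ (\dot\exists\ p_0) & \ominus \Rightarrow \dot\forall x{:}\sigma.\ (\mathit{nnf}\ (p_0\ x)\ \ominus) \end{cases}$$

We have proved the following lemma.

$$EM \to \forall p{:}o.\ ((E\ p) \leftrightarrow (E\ (\mathit{nnf}\ p\ \oplus))) \wedge (\neg(E\ p) \leftrightarrow (E\ (\mathit{nnf}\ p\ \ominus)))$$

Here $EM$ is the principle of excluded middle, defined in such a way that it
affects the first-order fragment only.

$$EM : *^p := \forall p{:}o.\ (E\ p) \vee \neg(E\ p)$$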
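The polarity trick can be made concrete with a small sketch. The Python function below is an illustration under stated assumptions, not the Coq definition: formulas are encoded as nested tuples with named bound variables (a first-order encoding chosen here for brevity, whereas the paper uses functions as quantifier bodies), and the names `POS`, `NEG` and the tuple tags are hypothetical. Every recursive call is made on a strict subformula, which is what makes the definition structurally recursive.

```python
POS, NEG = "+", "-"

def nnf(p, pol=POS):
    """Negation normal form of p (polarity POS) or of its negation (NEG)."""
    tag = p[0]
    if tag == "rel":
        return p if pol == POS else ("not", p)
    if tag == "not":
        # Flip the polarity instead of recursing on a negated subformula.
        return nnf(p[1], NEG if pol == POS else POS)
    if tag == "imp":
        if pol == POS:
            return ("or", nnf(p[1], NEG), nnf(p[2], POS))
        return ("and", nnf(p[1], POS), nnf(p[2], NEG))
    if tag in ("or", "and"):
        dual = {"or": "and", "and": "or"}
        out = tag if pol == POS else dual[tag]
        return (out, nnf(p[1], pol), nnf(p[2], pol))
    if tag in ("forall", "exists"):
        dual = {"forall": "exists", "exists": "forall"}
        out = tag if pol == POS else dual[tag]
        return (out, p[1], nnf(p[2], pol))
    raise ValueError(f"unknown connective: {tag}")

A = ("rel", "A", ())
B = ("rel", "B", ())
de_morgan = nnf(("not", ("and", A, B)))
```

In this sketch `de_morgan` is the disjunction of the two negated atoms, obtained without ever calling `nnf` on a formula that is not a subformula of the input.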

Skolemization of a formula means the removal of all existential quantifiers 
and the replacement of the variables that were bound by the removed existential 
quantifiers, by new terms, that is, Skolem functions applied to the universally 
quantified variables whose quantifier had the existential quantifier in its scope. 
Instead of quantifying each of the Skolem functions, we introduce an index type 
skolT, which may be viewed as a family of Skolem functions. 

$$\mathit{skolT} : *^s := \mathit{nat} \to \mathit{nat} \to \Pi n{:}\mathit{nat}.\ \sigma^n \to \sigma$$

A Skolem function, then, is a term $(f\ i\ j\ n) : \sigma^n \to \sigma$, with
$f : \mathit{skolT}$ and $i, j, n : \mathit{nat}$. Here, $i$ and $j$ are indices that
distinguish the family members. If the output of $\mathit{nnf}$ yields a conjunction,
the remaining clausification steps are performed separately on the conjuncts. (This
yields a significant speed-up in performance.) Index $i$ denotes the position of the
conjunct, $j$ denotes the number of the replaced existentially quantified variable in
that conjunct. The function $\mathit{skim}$ is defined as follows.

$$\mathit{skim} : \mathit{skolT} \to \mathit{nat} \to \mathit{nat} \to \Pi n{:}\mathit{nat}.\ \sigma^n \to o \to o :=$$
$$\lambda f{:}\mathit{skolT}.\ \lambda i, j, n{:}\mathit{nat}.\ \lambda t{:}\sigma^n.\ \lambda p{:}o.\ \mathit{Cases}\ p\ \mathit{of} \begin{cases} (\dot\forall\ p_0) \Rightarrow \dot\forall x{:}\sigma.\ (\mathit{skim}\ f\ i\ j\ (n+1)\ (\mathit{insert}\ x\ n\ t)\ (p_0\ x)) \\ (\dot\exists\ p_0) \Rightarrow (\mathit{skim}\ f\ i\ (j+1)\ n\ t\ (p_0\ (f\ i\ j\ n\ t))) \\ p' \Rightarrow p' \end{cases}$$

Given a variable $x : \sigma$, an arity $n : \mathit{nat}$ and a tuple $t : \sigma^n$, the
term $(\mathit{insert}\ x\ n\ t)$ adds $x$ at the end of $t$, resulting in a tuple of
type $\sigma^{n+1}$. Thus, if $p$ is a universal statement, the quantified variable is
added at the end of the so far constructed tuple $t$ of universally quantified
variables. In case $p$ matches $(\dot\exists\ p_0)$, the term $(f\ i\ j\ n\ t)$ is
substituted for the existentially quantified variable (the 'hole' in $p_0$) and index
$j$ is incremented. The third case, $p'$, exhausts the five remaining cases. As we force
input formulae to be in prenex normal form (via the definition of $\mathit{mcf}$),
nothing remains to be done.
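The recursion pattern of the Skolemization function can be sketched as follows. This is an illustrative Python analogue under assumptions, not the Coq term: formulas are nested tuples with named variables, the helper `substitute` and the tuple `("skolem", i, j, universals)` (standing for the Skolem term $(f\ i\ j\ n\ t)$) are hypothetical, and the input is assumed to be in prenex normal form.

```python
def substitute(p, var, term):
    """Replace free occurrences of the variable `var` in p by `term`."""
    tag = p[0]
    if tag == "rel":
        return ("rel", p[1], tuple(term if a == var else a for a in p[2]))
    if tag == "not":
        return ("not", substitute(p[1], var, term))
    if tag in ("imp", "or", "and"):
        return (tag, substitute(p[1], var, term), substitute(p[2], var, term))
    if tag in ("forall", "exists"):
        if p[1] == var:          # var is rebound here, so stop
            return p
        return (tag, p[1], substitute(p[2], var, term))
    raise ValueError(f"unknown connective: {tag}")

def skolemize(p, i=0, j=0, universals=()):
    """Skolemize a prenex formula; i is the conjunct index, j the counter
    of replaced existential variables, universals the tuple built so far."""
    tag = p[0]
    if tag == "forall":
        _, x, body = p
        # Add x at the end of the tuple of universally quantified variables.
        return ("forall", x, skolemize(body, i, j, universals + (x,)))
    if tag == "exists":
        _, y, body = p
        term = ("skolem", i, j, universals)
        return skolemize(substitute(body, y, term), i, j + 1, universals)
    return p                     # quantifier-free matrix: nothing to do

serial = ("forall", "x", ("exists", "y", ("rel", "<", ("x", "y"))))
skolemized = skolemize(serial, i=1)
```

For the seriality statement, the existential variable is replaced by a Skolem term that depends on the single universal variable `x`, which matches the shape of the clause for seriality in the worked example of Section 6.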

We proved the following lemma.

$$\sigma \to AC \to \forall i{:}\mathit{nat}.\ \forall p{:}o.\ (E\ p) \to \exists f{:}\mathit{skolT}.\ (E\ (\mathit{skim}\ f\ i\ 0\ 0\ tt\ p))$$

Here, $\sigma \to \cdot$ expresses the condition that $\sigma$ is non-empty. $AC$ is the
Axiom of Choice, which allows us to form Skolem functions.

$$AC : *^p := \forall \alpha{:}\sigma \to \mathit{skolT} \to o.\ (\forall x{:}\sigma.\ \exists f{:}\mathit{skolT}.\ (E\ (\alpha\ x\ f))) \to \exists F{:}\sigma \to \mathit{skolT}.\ \forall x{:}\sigma.\ (E\ (\alpha\ x\ (F\ x)))$$

Reconsider Figure 1 and substitute $\mathit{mcf}$ for $T$. We have proved that for all
objects $p : o$ the interpretation of the result of applying $\mathit{mcf}$ to $p$
implies the interpretation of $p$. Thus, given a suitable signature, from any
first-order formula $\varphi$ we can construct the classical equivalent
$(E\ (\mathit{mcf}\ \dot\varphi)) \in \mathit{MCF}$. The term $\mathit{mcfprf}$ makes
clausification effective on the meta-level.

$$\mathit{mcfprf} : EM \to AC \to \sigma \to \forall p{:}o.\ (E\ (\mathit{mcf}\ p)) \to (E\ p)$$

Given inhabitants $em : EM$ and $ac : AC$, an element $s : \sigma$, a proposition
$p : o$ and a proof $\rho : (E\ (\mathit{mcf}\ p))$, the term
$(\mathit{mcfprf}\ em\ ac\ s\ p\ \rho)$ is a proof of $(E\ p)$. The term
$(E\ (\mathit{mcf}\ p)) : *^p$ computes a format $C_1 \to \cdots \to C_n \to \bot$.
Here $C_1, \ldots, C_n : *^p$ are universally closed clauses that will be exported to
Bliksem, which constructs the $\lambda$-term $\rho$ representing a resolution
refutation of these clauses (see Sections 4 and 5). Finally, $\rho$ is type-checked in
Coq. Section 6 demonstrates the outlined constructions.

The complete Coq-script generating the correctness proof of the clausification
algorithm comprises approximately 65 pages. It is available at the following URL.

www.phil.uu.nl/~hendriks/claus.tar.gz



4 Minimal Resolution Logic 

There exist many representations of clauses and corresponding formulations of 
resolution rules. The traditional form of a clause is a disjunction of literals, that 
is, of atoms and negated atoms. Another form which is often used is that of a 
sequent, that is, the implication of a disjunction of atoms by a conjunction of 
atoms. 

Here we propose yet another representation of clauses, as far as we know not
used before. There are three main considerations.

- A structural requirement is that the representation of clauses is closed under
the operations involved, such as instantiation and resolution.

- The Curry-Howard correspondence is most direct between minimal logic
$(\to, \forall)$ and a typed lambda calculus with product types (with $\to$ as a
special, non-dependent, case of $\Pi$). Conjunction and disjunction in the logic
require either extra type-forming primitives and extra terms to inhabit these, or
impredicative encodings.

- The $\lambda$-representation of resolution proofs should preferably be linear in the
size of the premisses.

These considerations have led us to represent a clause like:

$$L_1 \vee \cdots \vee L_p$$

by the following classically equivalent implication in minimal logic:

$$\overline{L_1} \to \cdots \to \overline{L_p} \to \bot$$

Here $\overline{L_i}$ is the complement of $L_i$ in the classical sense (i.e. double
negations are removed). If $C$ is the disjunctive form of a clause, then we denote its
implicational form by $[C]$. As usual, these expressions are implicitly or explicitly
universally closed.
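The translation from disjunctive to implicational form is mechanical, as the following sketch suggests. It is an illustration only: the encoding of literals as `("pos", atom)` or `("neg", atom)` pairs, the type constructor tag `"->"`, and the function names `complement`, `lit_type`, `implicational` and `show` are all assumptions made here, not notation from the paper.

```python
BOT = "⊥"

def complement(lit):
    """Complement of a literal; double negations never arise."""
    sign, atom = lit
    return ("neg" if sign == "pos" else "pos", atom)

def lit_type(lit):
    """A positive literal is the atom A, a negative one is A → ⊥."""
    sign, atom = lit
    return atom if sign == "pos" else ("->", atom, BOT)

def implicational(clause):
    """[C] for a disjunctive clause C given as a list of literals."""
    result = BOT
    for lit in reversed(clause):
        result = ("->", lit_type(complement(lit)), result)
    return result

def show(t):
    """Pretty-print a type, with → associating to the right."""
    if isinstance(t, str):
        return t
    _, a, b = t
    left = show(a) if isinstance(a, str) else f"({show(a)})"
    return f"{left} → {show(b)}"

# The disjunctive clause  ¬P(x) ∨ Q  becomes  P(x) → (Q → ⊥) → ⊥.
example = implicational([("neg", "P x"), ("pos", "Q")])
```

The empty clause is mapped to $\bot$ itself, so a refutation ends in a proof of $\bot$ from the clause assumptions.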

A resolution refutation of given clauses $C_1, \ldots, C_n$ proves their inconsistency,
and can be taken as a proof of the following implication in minimal logic:

$$C_1 \to \cdots \to C_n \to \bot$$

The logic is called minimal as we use no particular properties of $\bot$. We are now
ready for the definition of the syntax of minimal resolution logic.

Definition 1. Let $\vec\forall.\ \phi$ denote the universal closure of $\phi$. Let
$\mathit{Atom}$ be the set of atomic propositions. We define the sets
$\mathit{Literal}$, $\mathit{Clause}$ and $\mathit{MCF}$ of, respectively, literals,
clauses and minimal clausal forms by the following abstract syntax.

$$\begin{array}{lcl} \mathit{Literal} & ::= & \mathit{Atom} \mid \mathit{Atom} \to \bot \\ \mathit{Clause} & ::= & \bot \mid \mathit{Literal} \to \mathit{Clause} \\ \mathit{MCF} & ::= & \bot \mid (\vec\forall.\ \mathit{Clause}) \to \mathit{MCF} \end{array}$$

Next we elaborate the familiar inference rules for factoring, permuting and 
weakening clauses, as well as the binary resolution rule. 

Factoring, Permutation, Weakening 

Let $C$ and $D$ be clauses, such that $C$ subsumes $D$ propositionally, that is, any
literal in $C$ also occurs in $D$. Let $A_1, \ldots, A_p, B_1, \ldots, B_q$ be literals
($p, q \ge 0$) and write

$$[C] = A_1 \to \cdots \to A_p \to \bot$$

and

$$[D] = B_1 \to \cdots \to B_q \to \bot,$$

assuming that for every $1 \le i \le p$ there is $1 \le j \le q$ such that $A_i = B_j$.
A proof of $[C] \to [D]$ is the following $\lambda$-term:

$$\lambda c{:}[C].\ \lambda b_1{:}B_1 \ldots \lambda b_q{:}B_q.\ (c\ \pi_1 \ldots \pi_p)$$

with $\pi_i = b_j$, where $j$ is such that $B_j = A_i$.
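The construction amounts to looking up, for each premiss of $[C]$, a bound variable of the same type among the premisses of $[D]$. A minimal sketch, assuming literals are given as plain strings and the proof term is rendered as text (the function name `subsumption_term` is hypothetical):

```python
def subsumption_term(c_lits, d_lits):
    """Render λc:[C]. λb1:B1 ... λbq:Bq. (c π1 ... πp) as a string,
    where each πi is the bound variable bj with Bj = Ai."""
    binders = " ".join(f"λb{j}:{b}." for j, b in enumerate(d_lits, start=1))
    args = []
    for a in c_lits:
        # index() raises ValueError when C does not subsume D.
        j = d_lits.index(a) + 1
        args.append(f"b{j}")
    body = " ".join(["c"] + args)
    return f"λc:[C]. {binders} ({body})".replace("  ", " ")

# Factoring (A occurs twice in C), permutation and weakening (E is new in D).
term = subsumption_term(["A", "A", "B"], ["B", "A", "E"])
```

Here `term` applies `c` to `b2 b2 b1`: the duplicated literal is served twice by the same variable (factoring), the order is rearranged (permutation), and `b3` is never used (weakening).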



Binary Resolution 

In the traditional form of the binary resolution rule for disjunctive clauses we
have premisses $C_1$ and $C_2$, containing one or more occurrences of a literal $L$
and of $\overline{L}$, respectively. The conclusion of the rule, the resolvent, is then
a clause $D$ consisting of all literals of $C_1$ different from $L$ joined with all
literals of $C_2$ different from $\overline{L}$. This rule is completely symmetric with
respect to $C_1$ and $C_2$.

For clauses in implicational form there is a slight asymmetry in the formulation
of binary resolution. Let $A_1, \ldots, A_p, B_1, \ldots, B_q$ be literals
($p, q > 0$) and write

$$[C_1] = A_1 \to \cdots \to A_p \to \bot,$$

with one or more occurrences of the negated atom $A \to \bot$ among the $A_i$ and

$$[C_2] = B_1 \to \cdots \to B_q \to \bot,$$

with one or more occurrences of the atom $A$ among the $B_j$. Write the resolvent
$D$ as

$$[D] = D_1 \to \cdots \to D_r \to \bot,$$

consisting of all literals of $C_1$ different from $A \to \bot$ joined with all literals
of $C_2$ different from $A$. A proof of $[C_1] \to [C_2] \to [D]$ is the following
$\lambda$-term:

$$\lambda c_1{:}[C_1].\ \lambda c_2{:}[C_2].\ \lambda d_1{:}D_1 \ldots \lambda d_r{:}D_r.\ (c_1\ \pi_1 \ldots \pi_p)$$

For $1 \le i \le p$, $\pi_i$ is defined as follows. If $A_i \ne (A \to \bot)$, then
$\pi_i = d_k$, where $k$ is such that $D_k = A_i$. If $A_i = A \to \bot$, then we put

$$\pi_i = \lambda a{:}A.\ (c_2\ \rho_1 \ldots \rho_q),$$

with $\rho_j$ ($1 \le j \le q$) defined as follows. If $B_j \ne A$, then $\rho_j = d_k$,
where $k$ is such that $D_k = B_j$. If $B_j = A$, then $\rho_j = a$. It is easily
verified that $\pi_i : (A \to \bot)$ in this case.
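That the term above has the claimed type can be checked mechanically. The sketch below builds the resolution proof term for propositional clauses and type-checks it with a tiny checker for the simply typed $\lambda$-calculus. It is an illustration under assumptions: the term and type encodings as tuples, and the names `arrow`, `typeof`, `apply_all` and `resolution_term`, are hypothetical and unrelated to the Coq and Bliksem implementations.

```python
BOT = "⊥"

def arrow(*types):
    """arrow(A, B, C) builds the type A → (B → C)."""
    result = types[-1]
    for t in reversed(types[:-1]):
        result = ("->", t, result)
    return result

def typeof(term, env):
    """Type of a simply typed λ-term; raises TypeError if ill-typed."""
    kind = term[0]
    if kind == "var":
        return env[term[1]]
    if kind == "lam":
        _, name, ty, body = term
        return ("->", ty, typeof(body, {**env, name: ty}))
    if kind == "app":
        f_ty = typeof(term[1], env)
        a_ty = typeof(term[2], env)
        if isinstance(f_ty, tuple) and f_ty[1] == a_ty:
            return f_ty[2]
        raise TypeError(f"cannot apply {f_ty} to {a_ty}")
    raise ValueError(f"unknown term: {kind}")

def apply_all(head, args):
    for a in args:
        head = ("app", head, a)
    return head

def resolution_term(c1, c2, d, atom):
    """Proof of [C1] → [C2] → [D], resolving on `atom`. The arguments
    c1, c2, d list the premisses A_i, B_j, D_k of the implications."""
    neg = ("->", atom, BOT)
    d_var = lambda lit: ("var", f"d{d.index(lit)}")
    rhos = [("var", "a") if b == atom else d_var(b) for b in c2]
    cut = ("lam", "a", atom, apply_all(("var", "c2"), rhos))   # π_i for A → ⊥
    pis = [cut if a == neg else d_var(a) for a in c1]
    term = apply_all(("var", "c1"), pis)
    for k in reversed(range(len(d))):
        term = ("lam", f"d{k}", d[k], term)
    term = ("lam", "c2", arrow(*c2, BOT), term)
    return ("lam", "c1", arrow(*c1, BOT), term)

# [C1] = (A → ⊥) → B → ⊥,  [C2] = A → C → ⊥,  [D] = B → C → ⊥.
c1, c2, d = [("->", "A", BOT), "B"], ["A", "C"], ["B", "C"]
proof = resolution_term(c1, c2, d, "A")
proof_type = typeof(proof, {})
```

When the negated atom occurs several times in the first clause, this sketch reuses the same subterm `cut` for each occurrence, so the resulting term is not linear.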

If $(A \to \bot)$ occurs more than once among the $A_i$, then
$(c_1\ \pi_1 \ldots \pi_p)$ need not be linear. This can be avoided by timely
factoring. Even without factoring, a linear proof term is possible, by taking the
following $\beta$-expansion of $(c_1\ \pi_1 \ldots \pi_p)$ (with $a'$ replacing copies
of proofs of $(A \to \bot)$):

$$(\lambda a'{:}A \to \bot.\ (c_1\ \pi_1 \ldots a' \ldots a' \ldots \pi_p))\ (\lambda a{:}A.\ (c_2\ \rho_1 \ldots \rho_q))$$

This remark applies to the rules in the next subsections as well.

Paramodulation

Paramodulation combines equational reasoning with resolution. For equational
reasoning we use the inductive equality of Coq. In order to simplify matters,
we assume a fixed domain of discourse $\sigma$, and denote equality of
$s_1, s_2 \in \sigma$ by $s_1 \approx s_2$.

Coq supplies us with the following terms:

$$\mathit{eqrefl} : \forall s{:}\sigma.\ (s \approx s)$$

$$\mathit{eqsubst} : \forall s{:}\sigma.\ \forall P{:}\sigma \to *^p.\ (P\ s) \to \forall t{:}\sigma.\ (s \approx t) \to (P\ t)$$

$$\mathit{eqsym} : \forall s_1, s_2{:}\sigma.\ (s_1 \approx s_2) \to (s_2 \approx s_1)$$

As an example we define $\mathit{eqsym}$ from $\mathit{eqsubst}$, $\mathit{eqrefl}$:

$$\lambda s_1, s_2{:}\sigma.\ \lambda h{:}(s_1 \approx s_2).\ (\mathit{eqsubst}\ s_1\ (\lambda s{:}\sigma.\ (s \approx s_1))\ (\mathit{eqrefl}\ s_1)\ s_2\ h)$$

Paramodulation for disjunctive clauses is the rule with premiss $C_1$ containing
the equality literal $t_1 \approx t_2$ and premiss $C_2$ containing literal $A[t_1]$.
The conclusion is then a clause $D$ containing all literals of $C_1$ different from
$t_1 \approx t_2$, joined with $C_2$ with $A[t_2]$ instead of $A[t_1]$.

Let $A_1, \ldots, A_p, B_1, \ldots, B_q$ be literals ($p, q > 0$) and write

$$[C_1] = A_1 \to \cdots \to A_p \to \bot,$$

with one or more occurrences of the equality atom $t_1 \approx t_2 \to \bot$ among the
$A_i$, and

$$[C_2] = B_1 \to \cdots \to B_q \to \bot,$$

with one or more occurrences of the atom $A[t_1]$ among the $B_j$. Write the
conclusion $D$ as

$$[D] = D_1 \to \cdots \to D_r \to \bot$$

and let $l$ be such that $D_l = A[t_2]$. A proof of $[C_1] \to [C_2] \to [D]$ can be
obtained as follows:

$$\lambda c_1{:}[C_1].\ \lambda c_2{:}[C_2].\ \lambda d_1{:}D_1 \ldots \lambda d_r{:}D_r.\ (c_1\ \pi_1 \ldots \pi_p)$$

If $A_i \ne (t_1 \approx t_2 \to \bot)$, then $\pi_i = d_k$, where $k$ is such that
$D_k = A_i$. If $A_i = (t_1 \approx t_2 \to \bot)$, then we want again that
$\pi_i : A_i$ and therefore put

$$\pi_i = \lambda e{:}(t_1 \approx t_2).\ (c_2\ \rho_1 \ldots \rho_q).$$

If $B_j \ne A[t_1]$, then $\rho_j = d_k$, where $k$ is such that $D_k = B_j$. If
$B_j = A[t_1]$, then we also want that $\rho_j : B_j$ and put (with $d_l : D_l$)

$$\rho_j = (\mathit{eqsubst}\ t_2\ (\lambda s{:}\sigma.\ A[s])\ d_l\ t_1\ (\mathit{eqsym}\ t_1\ t_2\ e))$$

The term $\rho_j$ has type $A[t_1]$ in the context $e : (t_1 \approx t_2)$. The term
$\rho_j$ contains an occurrence of $\mathit{eqsym}$ because the equality
$t_1 \approx t_2$ comes in the wrong direction for proving $A[t_1]$ from $A[t_2]$.
With this definition of $\rho_j$, the term $\pi_i$ has indeed type
$A_i = (t_1 \approx t_2 \to \bot)$.

As an alternative, it is possible to expand the proof of $\mathit{eqsym}$ in the proof
of the paramodulation step.

Equality Factoring

Equality factoring for disjunctive clauses is the rule with premiss $C$ containing
equality literals $t_1 \approx t_2$ and $t_1 \approx t_3$, and conclusion $D$ which is
identical to $C$ but for the replacement of $t_1 \approx t_3$ by
$t_2 \not\approx t_3$. The soundness of this rule relies on
$t_2 \approx t_3 \vee t_2 \not\approx t_3$.

Let $A_1, \ldots, A_p, B_1, \ldots, B_q$ be literals ($p, q > 0$) and write

$$[C] = A_1 \to \cdots \to A_p \to \bot,$$

with equality literals $t_1 \approx t_2 \to \bot$ and $t_1 \approx t_3 \to \bot$ among
the $A_i$. Write the conclusion $D$ as

$$[D] = B_1 \to \cdots \to B_q \to \bot,$$

with $B_{j'} = (t_1 \approx t_2 \to \bot)$ and $B_{j''} = (t_2 \approx t_3)$. We get a
proof of $[C] \to [D]$ from

$$\lambda c{:}[C].\ \lambda b_1{:}B_1 \ldots \lambda b_q{:}B_q.\ (c\ \pi_1 \ldots \pi_p).$$

If $A_i \ne (t_1 \approx t_3 \to \bot)$, then $\pi_i = b_j$, where $j$ is such that
$B_j = A_i$. For $A_i = (t_1 \approx t_3 \to \bot)$, we put

$$\pi_i = (\mathit{eqsubst}\ t_2\ (\lambda s{:}\sigma.\ (t_1 \approx s \to \bot))\ b_{j'}\ t_3\ b_{j''}).$$

The type of $\pi_i$ is indeed $t_1 \approx t_3 \to \bot$.

Note that the equality factoring rule is constructive in the implicational
translation, whereas its disjunctive counterpart relies on the decidability of
$\approx$. This phenomenon is well-known from the double negation translation.



Positive and Negative Equality Swapping 

The positive equality swapping rule for disjunctive clauses simply swaps an atom
$t_1 \approx t_2$ into $t_2 \approx t_1$, whereas the negative rule swaps the negated
atom. Both versions are obviously sound, given the symmetry of $\approx$.

We give the translation for the positive case first and will then sketch the
simpler negative case. Let $C$ be the premiss and $D$ the conclusion and write

$$[C] = A_1 \to \cdots \to A_p \to \bot,$$

with some of the $A_i$ equal to $t_1 \approx t_2 \to \bot$, and

$$[D] = B_1 \to \cdots \to B_q \to \bot.$$

Let $j'$ be such that $B_{j'} = (t_2 \approx t_1 \to \bot)$. The following term is a
proof of $[C] \to [D]$.

$$\lambda c{:}[C].\ \lambda b_1{:}B_1 \ldots \lambda b_q{:}B_q.\ (c\ \pi_1 \ldots \pi_p)$$

If $A_i \ne (t_1 \approx t_2 \to \bot)$, then $\pi_i = b_j$, where $j$ is such that
$B_j = A_i$. Otherwise

$$\pi_i = \lambda e{:}(t_1 \approx t_2).\ (b_{j'}\ (\mathit{eqsym}\ t_1\ t_2\ e))$$

such that also $\pi_i : (t_1 \approx t_2 \to \bot) = A_i$.

In the negative case the literals $t_1 \approx t_2$ in question are not negated, and we
change the above definition of $\pi_i$ into

$$\pi_i = (\mathit{eqsym}\ t_2\ t_1\ b_{j'}).$$

In this case we have $b_{j'} : (t_2 \approx t_1)$ so that
$\pi_i : (t_1 \approx t_2) = A_i$ also in the negative case.

Equality Reflexivity Rule 

The equality reflexivity rule simply cancels a negative equality literal of the form
$t \not\approx t$ in a disjunctive clause. We write once more the premiss

$$[C] = A_1 \to \cdots \to A_p \to \bot,$$

with some of the $A_i$ equal to $t \approx t$, and the conclusion

$$[D] = B_1 \to \cdots \to B_q \to \bot.$$

The following term is a proof of $[C] \to [D]$:

$$\lambda c{:}[C].\ \lambda b_1{:}B_1 \ldots \lambda b_q{:}B_q.\ (c\ \pi_1 \ldots \pi_p).$$

If $A_i \ne (t \approx t)$, then $\pi_i = b_j$, where $j$ is such that $B_j = A_i$.
Otherwise $\pi_i = (\mathit{eqrefl}\ t)$.

5 Lifting to Predicate Logic 

Until now we have only considered inference rules without quantifications. In 
this section we explain how to lift the resolution rule to predicate logic. Lifting 
the other rules is very similar. 

Recall that we must assume that the domain is not empty. Proof terms below
may contain a variable $s : \sigma$ as free variable. By abstraction
$\lambda s{:}\sigma$ we will close all proof terms. This extra step is necessary since
$\forall s{:}\sigma.\ \bot$ does not imply $\bot$ when the domain $\sigma$ is empty.
This is to be compared to $\Box\bot$ being true in a blind world in modal logic.

Consider the following clauses

$$C_1 = \forall x_1, \ldots, x_p{:}\sigma.\ [A_1 \vee R_1]$$

and

$$C_2 = \forall y_1, \ldots, y_q{:}\sigma.\ [\neg A_2 \vee R_2]$$

and their resolvent

$$R = \forall z_1, \ldots, z_r{:}\sigma.\ [R_1\theta_1 \vee R_2\theta_2]$$

Here $\theta_1$ and $\theta_2$ are substitutions such that
$A_1\theta_1 = A_2\theta_2$ and $z_1, \ldots, z_r$ are all variables that actually
occur in the resolvent, that is, in $R_1\theta_1 \vee R_2\theta_2$ after application
of $\theta_1, \theta_2$. It may be the case that $x_i\theta_1$ and/or $y_j\theta_2$
contain other variables than $z_1, \ldots, z_r$; these are understood to be replaced by
the variable $s : \sigma$ (see above). It may be the case that $\theta_1, \theta_2$ do
not represent a most general unifier. For soundness this is no problem at all, but even
completeness is not at stake since the resolvent is not affected. The reason for this
subtlety is that the proof terms involved must not contain undeclared variables.

Using the methods of the previous sections we can produce a proof $\pi$ that
has the type

$$[A_1 \vee R_1]\theta_1 \to [\neg A_2 \vee R_2]\theta_2 \to [R_1\theta_1 \vee R_2\theta_2].$$

A proof of $C_1 \to C_2 \to R$ is obtained as follows:

$$\lambda c_1{:}C_1.\ \lambda c_2{:}C_2.\ \lambda z_1 \ldots z_r{:}\sigma.\ (\pi\ (c_1\ (x_1\theta_1) \ldots (x_p\theta_1))\ (c_2\ (y_1\theta_2) \ldots (y_q\theta_2)))$$

We finish this section by showing how to assemble a $\lambda$-term for an entire
resolution refutation from the proof terms justifying the individual steps. Consider
a Hilbert-style resolution derivation $C_1, \ldots, C_m, C_{m+1}, \ldots, C_n$ with
premisses $c_1 : C_1, \ldots, c_m : C_m$. Starting from $n$ and going downward, we will
define by recursion for every $m \le k \le n$ a term $\pi_k$ such that

$$\pi_k[c_{m+1}, \ldots, c_k] : C_n$$

in the context extended with $c_{m+1} : C_{m+1}, \ldots, c_k : C_k$. For $k = n$ we can
simply take $\pi_n = c_n$. Now assume $\pi_{k+1}$ has been constructed for some
$k \ge m$. The proof $\pi_k$ is more difficult than $\pi_{k+1}$ since $\pi_k$ cannot use
the assumption $c_{k+1} : C_{k+1}$. However, $C_{k+1}$ is a resolvent, say of $C_i$ and
$C_j$ for some $i, j \le k$. Let $\rho$ be the proof of $C_i \to C_j \to C_{k+1}$. Now
define

$$\pi_k[c_{m+1}, \ldots, c_k] = (\lambda x{:}C_{k+1}.\ \pi_{k+1}[c_{m+1}, \ldots, c_k, x])\ (\rho\ c_i\ c_j) : C_n$$

The downward recursion yields a proof $\pi_m : C_n$ which is linear in the size of
the original Hilbert-style resolution derivation. Observe that a forward recursion
from $m$ to $n$ would yield the normal form of $\pi_m$, which could be exponential.
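The difference in size between the two recursions can be made visible with a small sketch. It renders proof terms as strings and is purely illustrative: the function names `assemble` and `normal_form`, the dictionary `steps`, and the naming of step proofs as `ρk` are assumptions introduced here.

```python
def assemble(m, steps):
    """steps[k] = (i, j) means C_k is a resolvent of C_i and C_j, for
    m < k <= n. Returns π_m as nested β-redexes (downward recursion)."""
    n = m + len(steps)
    term = f"c{n}"                                    # π_n = c_n
    for k in range(n - 1, m - 1, -1):                 # k = n-1, ..., m
        i, j = steps[k + 1]
        term = f"(λc{k + 1}. {term}) (ρ{k + 1} c{i} c{j})"
    return term

def normal_form(m, steps):
    """Inline every derived clause (forward recursion from m to n)."""
    n = m + len(steps)
    proofs = {k: f"c{k}" for k in range(1, m + 1)}
    for k in range(m + 1, n + 1):
        i, j = steps[k]
        proofs[k] = f"(ρ{k} {proofs[i]} {proofs[j]})"
    return proofs[n]

# Two premisses; C3 from C1, C2; every later clause uses its predecessor twice.
m, n = 2, 12
steps = {3: (1, 2), **{k: (k - 1, k - 1) for k in range(4, n + 1)}}
linear = assemble(m, steps)
inlined = normal_form(m, steps)
```

In this contrived derivation each step proof occurs exactly once in `linear`, whereas the number of occurrences of the premiss `c1` in `inlined` doubles with every step after the first.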

6 Example 

Let $P$ be a property of natural numbers such that $P$ holds for $n$ if and only if $P$
does not hold for any number greater than $n$. Does this sound paradoxical? It is
contradictory. We have $P(n)$ if and only if
$\neg P(n+1), \neg P(n+2), \neg P(n+3), \ldots$, which implies
$\neg P(n+2), \neg P(n+3), \ldots$, so $P(n+1)$. It follows that $\neg P(n)$ for
all $n$. However, $\neg P(0)$ implies $P(n)$ for some $n$, contradiction.
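The contradiction can be sanity-checked by brute force on a small structure. The sketch below is an illustration only and is no substitute for the Coq proof: it assumes a three-element domain with the relation $0 < 1 < 2 < 2$, transitively closed, and enumerates all eight candidate predicates `P`; the names `less`, `foo` and `models_of_foo` are chosen here for the example.

```python
from itertools import product

domain = [0, 1, 2]
less = {(0, 1), (0, 2), (1, 2), (2, 2)}    # 0 < 1 < 2 < 2, transitively closed

def transitive(rel):
    return all((x, z) in rel for (x, y) in rel for (y2, z) in rel if y == y2)

def serial(rel):
    return all(any((x, y) in rel for y in domain) for x in domain)

def foo(P):
    """For all x: P(x) holds iff P fails for every y with x < y."""
    return all(
        (x in P) == all(y not in P for y in domain if (x, y) in less)
        for x in domain
    )

# Enumerate every subset of the domain as a candidate extension of P.
candidates = [
    {x for x, keep in zip(domain, bits) if keep}
    for bits in product([False, True], repeat=len(domain))
]
models_of_foo = [P for P in candidates if foo(P)]
```

No candidate survives: at the element 2, which is below itself, the defining condition would require `P(2)` to hold exactly when it does not.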

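A closer analysis of this argument shows that the essence is not arithmetical,
but relies on the fact that $<$ is transitive and serial. The argument is also valid
in a finite structure, say $0 < 1 < 2 < 2$. This qualifies for a small refutation
problem, which we formalize in Coq. Type dependencies are given explicitly.
Thus, $\mathit{nat}$ is the domain of discourse. We declare a unary relation $P$ and a
binary relation $<$.

$$P : \mathit{nat} \to *^p$$
$${<} : \mathit{nat} \times \mathit{nat} \to *^p$$

Let $L : (\mathit{nelist}\ \mathit{nat}) := [1, 2]$ be the corresponding list of
arities. The relations are packaged by $\mathit{Rel}$.

$$\mathit{Rel} : \Pi i{:}(\mathit{index}\ L).\ \mathit{nat}^{(\mathit{select}\ L\ i)} \to *^p := \lambda i{:}(\mathit{index}\ L).\ \mathit{Cases}\ i\ \mathit{of} \begin{cases} 0 \Rightarrow P \\ 1 \Rightarrow {<} \end{cases}$$

For instance,
$(E\ \mathit{nat}\ L\ \mathit{Rel}\ (\mathit{rel}\ \mathit{nat}\ L\ 1\ (2,0))) =_{\beta\delta\iota} (\mathit{Rel}\ 1\ (2,0)) =_{\beta\delta\iota} 2 < 0$.
It is convenient to represent the relations $P, <$ as object-level constants.

$$\dot P : \mathit{nat} \to (o\ \mathit{nat}\ L) := (\mathit{rel}\ \mathit{nat}\ L\ 0)$$
$$\dot{<} : \mathit{nat} \times \mathit{nat} \to (o\ \mathit{nat}\ L) := (\mathit{rel}\ \mathit{nat}\ L\ 1)$$

Let us construct the formal propositions $\mathit{trans}$ and $\mathit{serial}$,
stating that $<$ is transitive and serial. (We write $n \mathbin{\dot<} m$ instead of
$(\dot{<}\ (n, m))$.)

$$\mathit{trans} : (o\ \mathit{nat}\ L) := \dot\forall x, y, z{:}\mathit{nat}.\ (x \mathbin{\dot<} y \mathbin{\dot\wedge} y \mathbin{\dot<} z) \mathbin{\dot\to} x \mathbin{\dot<} z$$
$$\mathit{serial} : (o\ \mathit{nat}\ L) := \dot\forall x{:}\mathit{nat}.\ \dot\exists y{:}\mathit{nat}.\ x \mathbin{\dot<} y$$

We define $\mathit{foo}$.

$$\mathit{foo} : (o\ \mathit{nat}\ L) := \dot\forall x{:}\mathit{nat}.\ (\dot P\ x) \mathbin{\dot\leftrightarrow} (\dot\forall y{:}\mathit{nat}.\ x \mathbin{\dot<} y \mathbin{\dot\to} \dot\neg(\dot P\ y))$$

Furthermore, we define $\mathit{taut}$ on the object-level, representing the example
informally stated at the beginning of this section. (If the latter is denoted by
$\varphi$, then $\mathit{taut} = \dot\varphi$.)

$$\mathit{taut} : (o\ \mathit{nat}\ L) := (\mathit{trans} \mathbin{\dot\wedge} \mathit{serial}) \mathbin{\dot\to} \dot\neg\mathit{foo}$$

Interpreting $\mathit{taut}$, that is $\beta\delta\iota$-normalizing
$(E\ \mathit{nat}\ L\ \mathit{Rel}\ \mathit{taut})$, results in '$\mathit{taut}$ without
dots'.

We declare $em : (EM\ \mathit{nat}\ L\ \mathit{Rel})$,
$ac : (AC\ \mathit{nat}\ L\ \mathit{Rel})$ and use $0$ to witness the non-emptiness of
$\mathit{nat}$. We reduce the goal $(E\ \mathit{nat}\ L\ \mathit{Rel}\ \mathit{taut})$,
using the result of Section 3, to the goal
$(E\ \mathit{nat}\ L\ \mathit{Rel}\ (\mathit{mcf}\ \mathit{nat}\ L\ \mathit{taut}))$. If
we prove this latter goal, say by a term $\rho$, then

$$(\mathit{mcfprf}\ \mathit{nat}\ L\ \mathit{Rel}\ em\ ac\ 0\ \mathit{taut}\ \rho) : (E\ \mathit{nat}\ L\ \mathit{Rel}\ \mathit{taut})$$

We normalize the new goal:

$$\begin{array}{l} (E\ \mathit{nat}\ L\ \mathit{Rel}\ (\mathit{mcf}\ \mathit{nat}\ L\ \mathit{taut})) =_{\beta\delta\iota} \\ \forall f{:}(\mathit{skolT}\ \mathit{nat}). \\ \quad (\forall x, y, z{:}\mathit{nat}.\ x < y \to y < z \to (x < z \to \bot) \to \bot) \\ \quad \to (\forall x{:}\mathit{nat}.\ (x < (f\ 1\ 0\ 1\ x) \to \bot) \to \bot) \\ \quad \to (\forall x{:}\mathit{nat}.\ (x < (f\ 2\ 0\ 1\ x) \to \bot) \to ((P\ x) \to \bot) \to \bot) \\ \quad \to (\forall x{:}\mathit{nat}.\ ((P\ (f\ 2\ 0\ 1\ x)) \to \bot) \to ((P\ x) \to \bot) \to \bot) \\ \quad \to (\forall x, y{:}\mathit{nat}.\ (P\ x) \to x < y \to (P\ y) \to \bot) \\ \quad \to \bot \end{array}$$

This is the minimal clausal form of the original goal. We refrained from exhibiting
its proof $\rho$ for reasons of space. The Coq-script generating $\rho$ can be found in
example.v in the tar file mentioned at the end of Section 3.

1 Introduction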

This is a brief update on the Tps automated theorem proving system for
classical type theory, which was described in [3]. Manuals and information about
obtaining Tps can be found at http://gtps.math.cmu.edu/tps.html.

In Section 2 we discuss some examples of theorems which Tps can now prove 
automatically, and in Section 3 we discuss an example which illustrates one of 
the many challenges of theorem proving in higher-order logic. We first provide a 
brief summary of the key features of Tps . 

Tps uses Church’s type theory [8] (typed A-calculus) as its logical language. 
Wffs are displayed on the screen and in printed proofs in the notation of this 
system of symbolic logic. 

One can use Tps in automatic, semi-automatic, or interactive mode to con- 
struct proofs in natural deduction style, and a mixture of these modes of oper- 
ation is most useful for significant applications. Our current research is focused 
primarily on increasing the power of the purely automatic search procedures, 
since these are useful in speeding up the construction of proofs even if many of 
the key ideas must be supplied interactively. 

When searching for a proof of a theorem, Tps first tries to find an expansion 
proof [11], of which an important component is a mating [1] (otherwise known as 
a spanning set of connections [4]). Various search procedures are implemented in 
Tps, most notably those described in [6], [5], [10], and [9]. The method of dual 
instantiation of definitions discussed in [7] is also implemented in Tps . Once an 
expansion proof has been found, it is translated into a natural deduction proof 
by the methods of [13] and [14]. 

Many aspects of the behavior of Tps can be varied by changing the settings 
of flags. These flags provide a convenient facility for exploring various aspects of 
the problem of searching for proofs, and are essential in setting bounds for the 
many dimensions of proof search in higher-order logic. 

* This material is based upon work supported by the National Science Foundation 
under grant CCR-9732312. 
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2 New Theorems 

In the notation used by Tps, $o$ is the type of truth values, $\iota$ is the type of
individuals, and $(\alpha\beta)$ (which some authors prefer to write as $(\beta \to \alpha)$) is the
type of functions from elements of type $\beta$ to elements of type $\alpha$. An entity of
type $(o\alpha)$ is regarded as a set of elements of type $\alpha$, and $f_{o\alpha}x_\alpha$ can be interpreted
as meaning that $x_\alpha$ is in $f_{o\alpha}$. $\gamma\beta\alpha$ is an abbreviation for $((\gamma\beta)\alpha)$.

A dot stands for a left bracket whose mate is as far to the right as is consistent 
with the pairing of brackets already present. 

The following theorems were all proven completely automatically by Tps once 
the flags were set. All the timings quoted below represent the internal runtime, 
excluding garbage-collect time, used by Tps to find an expansion proof and trans- 
late it into a natural deduction proof on a Tangent workstation with a Pentium 
III processor and 512 megabytes of RAM using Allegro Common Lisp 5.0 for 
Linux. The numbers are useful only for their approximate magnitudes. They 
may not represent optimal settings of the flags, and the times required to prove 
these theorems will probably increase as ways are found to move more of the 
burden of setting flags from users to Tps.

We start with several theorems concerned with various formulations of the 
Axiom of Choice.¹ We first list these formulations. In [15] these are presented
as statements of axiomatic set theory; in a type-theoretic context their variables 
must be given types, and they take the form of axiom schemas. Logical relations 
between these formulations of the Axiom of Choice are then complicated by the 
need to have appropriate relations between the types which are involved. 

AC1($\beta$) from [15]: $\forall s_{o(o\beta)}.\ \forall X_{o\beta}[sX \supset \exists y_\beta\, Xy] \supset \exists f_{\beta(o\beta)} \forall X_{o\beta}.\ sX \supset X.fX$
If s is a set of non-empty sets, there is a function f such that for every $x \in s$,
$f(x) \in x$.

AC3($\beta$, $\alpha$) from [15]: $\forall r_{(o\beta)\alpha} \exists g_{\beta\alpha} \forall x_\alpha.\ \exists y_\beta\, rxy \supset rx.gx$
For every function r, there is a function g such that for every x, if x is in the
domain of r and $r(x) \neq \emptyset$, then $g(x) \in r(x)$. (In a set-theoretic context it may
be assumed that the values of r are sets, but in a type-theoretic context r must
be given a type compatible with this assumption.)

AC17($\alpha$) from [15]: $\forall g_{o\alpha(\alpha(o\alpha))}.\ \forall h_{\alpha(o\alpha)} \exists u_\alpha [gh]u \supset \exists f_{\alpha(o\alpha)}\, gf.f.gf$
If s is a set (which we represent as the set of all elements of type $\alpha$), t is the
collection of all non-empty subsets of s, F is the set of all functions (which must
have type $(\alpha(o\alpha))$) from t to s, and g is a function from F to t, then there is an
$f \in F$ such that $f(g(f)) \in g(f)$.

AC($\alpha$) from [2]: $\exists f_{\alpha(o\alpha)} \forall X_{o\alpha}.\ \exists t_\alpha\, Xt \supset X.fX$
There is a universal choice function f (for elements of type $\alpha$) such that if X is
any non-empty set (whose elements are of type $\alpha$), then $fX \in X$.

THM532: AC1($\beta$) $\supset$ AC3($\beta$, $\alpha$) (3.96 seconds)

THM533: AC3($\alpha$, $o\alpha$) $\supset$ AC1($\alpha$) (1.92 seconds)

THM560: AC3($\alpha$, $o\alpha$) $\equiv$ AC1($\alpha$) (19.26 seconds)

¹ See [12] for a discussion of interactive proofs of many similar theorems.

THM534: AC1($\alpha$) $\supset$ AC17($\alpha$) (4.71 seconds)

THM541: AC($\alpha$) $\equiv$ AC1($\alpha$) (5.00 seconds)

THM531E: FINITE-SET $C_{o\alpha} \wedge B_{o\alpha} \subseteq C \supset$ FINITE-SET $B$ (21.50 seconds)

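THM531E says that a subset of a finite set is finite. FINITE-SET is defined as
$[\lambda A_{o\alpha} \forall P_{o(o\alpha)}.\ \forall E_{o\alpha}[\sim \exists t_\alpha\, Et \supset PE] \wedge \forall Y_{o\alpha} \forall x_\alpha \forall Z_{o\alpha}[[PY \wedge .\, Z \subseteq .\, Y + x] \supset PZ] \supset PA]$,
which is one of several ways one can define finiteness inductively. $Y_{o\alpha} + x_\alpha$
is defined as $[\lambda t_\alpha.\ Y_{o\alpha}t \vee t = x_\alpha]$, which is another notation for $Y_{o\alpha} \cup \{x_\alpha\}$.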

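THM196B: $\sim [a_\iota = b_\iota] \supset\ \sim \forall j_{\iota\iota} \forall k_{\iota\iota}.\ \text{ITERATE+}\ j[k \circ j] \supset \text{ITERATE+}\ j\,k$

(1.41 seconds)

ITERATE+ is defined as $\lambda f_{\alpha\alpha} \lambda g_{\alpha\alpha} \forall p_{o(\alpha\alpha)}.\ pf \wedge \forall j_{\alpha\alpha}[pj \supset p.f \circ j] \supset pg$, and so
ITERATE+ f g means that g is an iterate of f, i.e., a function of the form
$f \circ \ldots \circ f$. The symbol $\circ$ denotes the composition of functions, and is defined as
$\lambda f_{\alpha\beta} \lambda g_{\beta\gamma} \lambda x_\gamma\, f.gx$. The theorem refutes the conjecture that if $k \circ j$ is an iterate
of j, then k must be an iterate of j. Of course, the conjecture is trivially true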
if there is just one individual, so the theorem depends on the assumption that 
there are two distinct individuals. Tps proves the theorem by constructing the 
simple counterexample where k is the identity function and j is the constant 
function whose value is b. The proof consists of a verification that this is indeed 
a counterexample. 
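
The counterexample can be checked by brute force on a two-element domain. The following is only an illustrative sketch and not part of Tps: functions are encoded as tuples of values, and the helper names `compose` and `iterates` are our own assumptions.

```python
def compose(f, g):
    """Return f o g as a tuple-encoded function: (f o g)(x) = f(g(x))."""
    return tuple(f[g[x]] for x in range(len(g)))

def iterates(j):
    """All functions of the form j o ... o j (one or more copies of j)."""
    seen = set()
    current = j
    while current not in seen:
        seen.add(current)
        current = compose(j, current)
    return seen

# Two distinct individuals a = 0 and b = 1; a function is the tuple of its values.
a, b = 0, 1
k = (a, b)   # the identity function
j = (b, b)   # the constant function whose value is b

k_after_j_is_iterate = compose(k, j) in iterates(j)   # k o j equals j itself
k_is_iterate = k in iterates(j)                       # the identity is not an iterate of j
```

Here `k_after_j_is_iterate` holds while `k_is_iterate` fails, which is exactly the situation described by the theorem.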

THM563: CLOS-SYS1.$\lambda W_{o\beta}.\ W 0_\beta \wedge \forall x_\beta \forall y_\beta[Wy \wedge x \le y \supset Wx] \wedge \forall x_\beta \forall y_\beta \forall z_\beta.\ Wx \wedge Wy \wedge \text{JOIN}\ x\,y\,z \supset Wz$ (49.2 minutes)

THM563 states that the collection of sets $W_{o\beta}$ which contain an element 0, are
downward closed with respect to a binary relation $\le$, and are closed with respect
to a ternary relation JOIN, is a closure system.

CLOS-SYS1 is defined as $\lambda CL_{o(o\beta)} \forall S_{o(o\beta)}.\ S \subseteq CL \supset CL.\bigcap S$, and $\bigcap$ is
defined as $\lambda S_{o(o\beta)} \lambda x_\beta \forall W_{o\beta}.\ SW \supset Wx$, so a closure system is a collection of
sets closed under arbitrary intersections.
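
The closure-system property can be tested exhaustively on a small finite instance. The sketch below is an assumption-laden illustration rather than anything taken from Tps: the four-element diamond order, the use of least upper bounds as the JOIN relation, and all function names are hypothetical choices.

```python
from itertools import combinations

ELEMENTS = (0, 1, 2, 3)
# Hypothetical diamond-shaped order: 0 below 1 and 2, which are both below 3.
LEQ = {(x, x) for x in ELEMENTS} | {(0, 1), (0, 2), (0, 3), (1, 3), (2, 3)}

def join(x, y):
    """Least upper bound of x and y in the diamond order."""
    uppers = [z for z in ELEMENTS if (x, z) in LEQ and (y, z) in LEQ]
    return next(z for z in uppers if all((z, u) in LEQ for u in uppers))

def admissible(w):
    """W contains 0, is downward closed, and is closed under joins."""
    return (0 in w
            and all(x in w for y in w for x in ELEMENTS if (x, y) in LEQ)
            and all(join(x, y) in w for x in w for y in w))

def subsets(items):
    """All subsets of the given items, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def big_intersection(s):
    """Elements lying in every member of s (all elements if s is empty)."""
    return frozenset(x for x in ELEMENTS if all(x in w for w in s))

CL = [w for w in subsets(ELEMENTS) if admissible(w)]
is_closure_system = all(big_intersection(s) in CL for s in subsets(CL))
```

In this instance `CL` consists of four sets, and the intersection of every subcollection again lies in `CL`.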

THM563 is very general, but we can illustrate it with the following special
case. Let $\beta$ be the type of finite binary trees. We use 0 as a name for the tree
with a single node. We can define a partial ordering $\le$ on this type by saying
a tree x is less than a tree y if we can replace the leaves of x by some trees to
obtain y. Thus, 0 is the smallest member of $\beta$. Given trees x and y, there is a
tree called $[x \vee y]$ which is the lub{x, y} such that JOIN x y $[x \vee y]$. We can
represent each infinite binary tree by a certain set $W_{o\beta}$ of finite binary trees,
which approximate the infinite tree in the sense illustrated below, where the
finite trees approximate the infinite tree on the right.

[Figure: an ascending chain of finite binary trees, starting with 0, approximating the infinite binary tree shown on the right.]

(We can regard finite binary trees as special cases of infinite binary trees. A finite
binary tree $x_\beta$ when considered as an infinite binary tree is the set $\{y_\beta \mid y \le x\}$,
an object of type $(o\beta)$.) A set $W_{o\beta}$ that represents a tree contains the tree 0, is
downward closed, and is closed with respect to joins. In the case of this example,
THM563 shows that the set of infinite trees constitutes a closure system, and
therefore forms a complete lattice under the subset ordering.

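X5204: $\# f_{\alpha\beta}[\bigcup w_{o(o\beta)}] = \bigcup.\#[\# f]w$ (33.?6 seconds)

$\#$ is defined as $\lambda f_{\alpha\beta} \lambda x_{o\beta} \lambda z_\alpha \exists t_\beta.\ xt \wedge z = ft$, and so $\# f_{\alpha\beta} x_{o\beta}$ is the image of
the set $x_{o\beta}$ under the function $f_{\alpha\beta}$. This is a polymorphic definition, and the
instances of $\#$ in the theorem have appropriate types attached to them. $\bigcup$ is
defined as $\lambda D_{o(o\alpha)} \lambda x_\alpha \exists S_{o\alpha}.\ DS \wedge Sx$; hence $\bigcup D_{o(o\alpha)}$ is the union of the collection
$D_{o(o\alpha)}$ of sets.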
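
For finite sets, the statement of X5204 (the image of a union is the union of the images) can be checked directly. This is a small sketch with made-up data; `image` and `union` are our own names for the two operators defined above.

```python
def image(f, xs):
    """The image of the set xs under the function f."""
    return frozenset(f(t) for t in xs)

def union(collection):
    """The union of a collection of sets."""
    return frozenset(x for s in collection for x in s)

# Hypothetical data: f maps an integer to its parity; w is a collection of sets.
f = lambda n: n % 2
w = frozenset({frozenset({1, 3}), frozenset({2, 4, 5}), frozenset()})

lhs = image(f, union(w))                        # image of the union
rhs = union(image(lambda s: image(f, s), w))    # union of the images
```

Note that on the right-hand side the image operator is used at two different types, once for `f` and once for the function that maps a set to its image under `f`, mirroring the polymorphism mentioned above.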

X5311A-EXT: $\forall y_\alpha[\iota[= y] = y] \wedge \forall p_{o\alpha} \forall q_{o\alpha}[\forall x_\alpha[px = qx] \supset \forall r_{o(o\alpha)}.\ rp \supset rq] \supset.\ \Sigma^1 p_{o\alpha} \supset p.\iota p$ (3.0 minutes)

$\Sigma^1$ is defined as $\lambda p_{o\alpha} \exists y_\alpha.\ py \wedge \forall z_\alpha.\ pz \supset y = z$, and represents the property of
being a one-element set. This is essentially theorem 5311 from [2]. In order to
prove it one needs axioms of descriptions and extensionality, so they are made
antecedents of the main implication. The theorem says that if p is a one-element
set, then the description operator $\iota$ maps p to the unique entity which is in p.



3 A Challenge 

We conclude with a discussion of an example which poses a significant challenge 
for Tps and other theorem provers for higher-order logic and set theory. Cantor’s 
theorem for sets says that if U is any set and W is its power set, then W has 
larger cardinality than U. This is usually expressed by saying that there is no 
surjection from U onto W. If one takes the members of U as the set of individuals, 
the theorem can be expressed simply by the wff $\sim \exists g_{o\iota\iota} \forall f_{o\iota} \exists j_\iota.\ gj = f$, which
we called X5304 in [2] and [3]. Tps has been able to prove this for many years. 
However, one can also express the fact that W has larger cardinality than U by 
saying that there is no injection from W into U, which we formalize as follows: 

X5309: $\sim \exists h_{\iota(o\iota)} \forall p_{o\iota} \forall q_{o\iota}.\ hp = hq \supset p = q$ (not proven)

We call this the Injective Cantor Theorem. 

Here is an informal proof of this theorem. Suppose there is a function
$h : W \to U$ such that (1) h is injective. Let (2) $D = \{ht \mid t \in W \text{ and } ht \notin t\}$. Note
that (3) $D \in W$. Now suppose that (4) $hD \in D$. Then (by 2) there is a set t
such that (5) $t \in W$ and (6) $ht \notin t$ and (7) $hD = ht$. Therefore (8) $D = t$ (by
1, 7), so (9) $hD \notin D$ (by 6, 8). This argument (4-9) shows that (10) $hD \notin D$.
Thus (11) $hD \in D$ (by 2, 3, 10). This contradiction shows that there can be no
such h.

It is easy to prove parts of this argument automatically. Define IDIAG to
be $[\lambda h_{\iota(o\iota)} \lambda z_\iota \exists s_{o\iota}.\ z = hs\ \wedge \sim s.hs]$. Then [IDIAG h] represents the set D of the
informal argument above. Tps can automatically prove the following theorems, from
which X5309 follows trivially:

THM143B: $\forall p_{o\iota} \forall q_{o\iota}[h_{\iota(o\iota)}p = hq \supset p = q] \supset\ \sim \text{IDIAG}\ h.h.\text{IDIAG}\ h$ (3.35 seconds)

THM144B: $\text{IDIAG}\ h.h.\text{IDIAG}\ h$ (0.?7 seconds)
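
The diagonal set of the informal argument can be explored by exhaustive enumeration over a small set of individuals. The following sketch is our own illustration and is unrelated to how Tps searches for proofs; the three-element set `U` and the names `powerset` and `idiag` are assumptions. For every function h from the power set W to U it confirms that h(D) is in D, and that some set different from D has the same h-value, so h cannot be injective.

```python
from itertools import combinations, product

def powerset(u):
    """All subsets of u, as frozensets."""
    items = sorted(u)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def idiag(h, w):
    """D = {h(t) | t in W and h(t) not in t}, as in step (2) of the argument."""
    return frozenset(h[t] for t in w if h[t] not in t)

U = {0, 1, 2}
W = powerset(U)

checked = 0
for values in product(sorted(U), repeat=len(W)):
    h = dict(zip(W, values))      # one candidate function h : W -> U
    D = idiag(h, W)
    assert h[D] in D              # the statement of THM144B
    # A witness t different from D with h(t) = h(D): h is not injective.
    assert any(t != D and h[t] == h[D] and h[t] not in t for t in W)
    checked += 1
```

The loop visits all $3^8$ functions; such enumeration is of course only possible for finite U, whereas X5309 makes the claim for arbitrary types.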

However, a completely automatic proof of X5309 seems well beyond the
present capabilities of Tps. The expansion proof which corresponds to the argu-
ment above involves instantiating a quantifier on a set variable with a wff which
contains another quantifier on a set variable, which must also be instantiated
with a wff which contains a quantifier. We may say that such an expansion proof
has quantificational depth 3. Thus far Tps has found expansion proofs only of
quantificational depth $\le 2$.

We may define the quantificational depth of a theorem to be the minimum 
of the quantificational depths of its expansion proofs. (Thus all theorems of 
first-order logic have quantificational depth at most 1). Research on methods 
of proving theorems which are deep in this sense should stimulate significant 
progress in higher-order theorem proving.

Abstract. The Nuprl system is a framework for reasoning about mathe-
matics and programming. Over the years its design has been substantially
improved to meet the demands of large-scale applications. Nuprl LPE,
the newest release, features an open, distributed architecture centered 
around a flexible knowledge base and supports the cooperation of inde- 
pendent formal tools. This paper gives a brief overview of the system 
and the objectives that are addressed by its new architecture. 



1 Introduction 

The Nuprl proof development system [C+86] is a framework for the development
of formalized mathematical knowledge as well as for the synthesis, verification,
and optimization of software. The original system was based on a significant
extension of Martin-Löf’s intuitionistic Type Theory [ML84], which includes
formalizations of the fundamental concepts of mathematics, data types, and
programming. The system itself supports interactive and tactic-based reason-
ing, decision procedures, evaluation of programs, language extensions through
user-defined concepts, and an extendable library of verified knowledge from
various domains. Since its first release in 1984 it has been used in increas-
ingly large applications in mathematics and programming, such as verifica-
tions of a logic synthesis tool [AL93] and of the SCI cache coherency protocol
[How96] as well as the verification and optimization of group communication
systems [KHH98,Kre99,L+99].

Over the years it has turned out that the rapidly growing demands for formal 
knowledge and tools cannot be met by a single closed system anymore. Auto- 
matic tools such as decision procedures, fully automatic theorem provers, proof 
planners, rewrite engines, model checkers, and computer algebra systems have 
been very successful in their respective areas, but have limited application do-
mains. Proof assistants like Nuprl, Isabelle [Pau90], HOL [GM93], PVS [O+96],
and Ωmega [B+97] are more general but offer a lesser degree of automation. Each
of these systems has accumulated a substantial amount of formalized knowledge 
in its respective formalism, but no system contains all the currently available 
formal knowledge. A variety of user interfaces have been developed for these 
systems each with its own strengths and weaknesses. 

* Part of this work was supported by DARPA grant F 30620-98-2-0198 

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 170-176, 2000.

These observations led to an entirely new design of the Nuprl system that
shall replace the monolithic architecture of current theorem proving environ-
ments. The Nuprl LPE (logical programming environment) is an open, distributed
architecture that integrates all its key subsystems as independent components
and, by using a flexible knowledge base as its central component, supports the
interoperability of current proof technology.

In the following we shall briefly discuss the key issues that shall be addressed 
by Nuprl LPE and describe its architecture as well as the available components.

2 Design Objectives 

Besides preserving and expanding the strengths of the existing Nuprl system, the
new design of the Nuprl LPE is based on the following objectives.

Interoperability: The Nuprl LPE shall provide a platform for the cooperation
of proof systems and a common knowledge base that makes formal theories
available to the individual systems. Special support for computational logics
shall be offered, but other logics shall be accommodated as well.
Optimization and Productivity: To optimize software reuse and system
maintenance, the key components of the Nuprl LPE have to be independently
operating programs that communicate using a protocol. This will increase
the system’s productivity, as several inference engines can be run in parallel
or even off-line while the user continues to work on other proof goals.
Accountability: As the Nuprl LPE shall accommodate a variety of logics, there
cannot be an absolute notion of correctness anymore. Instead, the extent
to which one may rely upon formalized knowledge in the library must be
accounted for. Justifications for the validity of proofs depend upon what
rules and axioms are admitted and on the reliability of the inference engines
employed. The design has to make sure that such information can be easily
exposed to determine which proofs are valid in a particular logic.
Information Preservation: The system has to make sure that knowledge can- 
not be destroyed or corrupted if a user erroneously overwrites a proof or if 
the system crashes before a proof could be saved. The system must guarantee 
that such information can always be recovered. 

Large Scale Object Management: The system should use abstract object 
references rather than traditional naming schemes. This is invaluable for 
merging mass libraries, where name collisions are inevitable, and also for 
performing context-specific tasks.

3 The NuprI LPE Architecture 

Figure 1 illustrates the distributed open architecture of Nuprl LPE. The system
is organized as a collection of communicating processes that are centered around
a common knowledge base, called the library. The library contains definitions,
theorems, inference rules, meta-level code (e.g. tactics), and structure objects
that can be used to provide a modular structure for the library’s contents. In-
ference engines (refiners), user interfaces (editors), rewrite engines (evaluators),
and translators are started as independent processes that can connect to the
library at any time.

Fig. 1. Nuprl LPE distributed open architecture

The library can communicate with arbitrarily many other processes. This al-
lows the user to connect several refiners and evaluators simultaneously, e.g. the
Nuprl and MetaPRL [Met] refiners, major systems like HOL, PVS, Ωmega, or
SPECWARE [SJ95], decision procedures, first-order provers, Mathematica [Wol88],
and the Maude [C+99b] rewrite engine, and to have them cooperate through the
library, which stores the formal knowledge required by these tools. It is also
possible to run different refiners in parallel on the same proof goal or several
instances of the same refiner on different proof goals.
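
The idea of running different refiners on the same goal can be sketched with a thread pool. This is a hypothetical toy and does not reflect the actual communication protocol between the library and its refiners: goals are plain strings, a refiner is any function that returns a list of subgoals or `None` when it gives up, and all names are our own.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_competing_refiners(goal, refiners):
    """Run every refiner on the same goal; return (name, subgoals) of the first success."""
    with ThreadPoolExecutor(max_workers=len(refiners)) as pool:
        futures = {pool.submit(fn, goal): name for name, fn in refiners.items()}
        for future in as_completed(futures):
            subgoals = future.result()
            if subgoals is not None:      # None models "this refiner gave up"
                return futures[future], subgoals
    return None

def splitter(goal):
    """Hypothetical refiner: splits a conjunction into its conjuncts."""
    return goal.split(" & ") if " & " in goal else None

def trivial(goal):
    """Hypothetical refiner: closes the goal True with no remaining subgoals."""
    return [] if goal == "True" else None

result = run_competing_refiners("A & B", {"splitter": splitter, "trivial": trivial})
```

In this run only `splitter` succeeds, so `result` names it together with the two subgoals it produced.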

Providing several editors enables several users to work in parallel on the same 
formal theory while using their favorite interface. At the same time external users 
can access the system through the Web without having to restart the whole 
system, as one would have to do in monolithic architectures. 

Translators between the formal knowledge stored in the library and, for in-
stance, programming languages like Java or OCaml [Kre97,KHH98] allow the for-
mal reasoning tools to supplement real-world software from various domains and
thus provide a logical programming environment for the respective languages. 

The Nuprl LPE provides special support for computational and constructive 
logics but it can accommodate other logics equally well. Its open architecture 
makes it possible that different systems with different formalisms and represen-
tation structures cooperate through a common knowledge base that can store all
of this information. It is obvious that translations between different formalisms
need to be developed to make such a cooperation possible and that several the-
oretical issues need to be addressed for each of them (see e.g. [How96,FH97]).
But the Nuprl LPE provides the necessary infrastructure for these translations.
They only have to operate on the formal knowledge stored in the library and
can be provided as independent external processes that are invoked as neces-
sary. Translations can also be used in a transitive fashion, e.g., implementing the
translation between Maude and Nuprl [C+99a] automatically gives us access to
all formalisms that have been translated into Maude.

In the current standard configuration, which we call Nuprl 5, the system es-
sentially provides an extended functionality of the Nuprl 4 system [Jac94]. It
consists of the library, the Nuprl 5 editor, and the Nuprl 5 refiner. The library
contains all the definitions, theorems, inference rules, and tactics of Nuprl 4 as
well as the structure objects that emulate the Nuprl 4 system architecture. The
Nuprl 5 editor is capable of interpreting these structure objects while displaying
and editing proofs and terms. The Nuprl 5 refiner is able to interpret inference
rules and the ML code of the tactics. In the following we will describe the indi-
vidual components more specifically.

The Knowledge Base. The knowledge base is based on a transaction model 
for entering and modifying objects. Changes to objects, e.g. the effects of editor
commands or inference steps, are immediately committed to the library. This 
makes sure that knowledge doesn’t get lost in case of a system failure, which 
could happen in systems that keep newly developed knowledge in memory until 
it is explicitly saved to disk. The knowledge base also provides the option to 
undo changes, redo transactions, or to have several processes view or work on 
the same object - essentially following the same protocols as databases. 

However, changes do not overwrite an object but instead create a new version. 
The previous version is preserved until it is explicitly destroyed in a garbage col- 
lection process. A version control mechanism allows the user to recover previous 
versions of an object. This protects user data from being corrupted or destroyed 
erroneously and enables a user to create several proofs of the same theorem. 
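
The versioning behavior described above can be pictured with a toy object store. The sketch below is purely illustrative and makes no claim about the actual implementation of the knowledge base: the class `Library`, its method names, and the string-valued object contents are all assumptions.

```python
class Library:
    """Toy object store: every change appends a new version instead of overwriting."""

    def __init__(self):
        self._versions = {}   # object id -> list of contents, oldest first

    def commit(self, oid, content):
        """Record a change immediately as a new version; return its version number."""
        history = self._versions.setdefault(oid, [])
        history.append(content)
        return len(history) - 1

    def current(self, oid):
        """Return the most recent version of an object."""
        return self._versions[oid][-1]

    def recover(self, oid, version):
        """Return an earlier version; later commits have not destroyed it."""
        return self._versions[oid][version]

    def undo(self, oid):
        """Undo by committing the previous content again as a new version."""
        history = self._versions[oid]
        if len(history) < 2:
            raise ValueError("nothing to undo")
        return self.commit(oid, history[-2])

lib = Library()
v0 = lib.commit("thm1/proof", "proof by induction")
v1 = lib.commit("thm1/proof", "proof by cases")   # does not erase v0
lib.undo("thm1/proof")
```

After the `undo`, the current content is again the first proof, while the second proof remains recoverable as version `v1`. Garbage collection of old versions is deliberately left out of the sketch.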

To account for the validity of library objects, the knowledge base supports 
dependency tracking. For this purpose a variety of information is stored together 
with an object, e.g. the logical rules and theorems on which it depends, the 
exact version of the refiners that were used to prove it, timestamps, etc. This 
information will help a user to validate theorems that rely on knowledge created 
by several systems, provided that the conditions for hybrid validity with respect to the
underlying logics are well understood and stored in the library. For instance, a
theorem referring to lemmata from Nuprl (constructive type theory) and HOL
(classical higher order logic) would be marked as constructively valid if the HOL 
theorems involve only decidable predicates. 
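
A minimal sketch of how dependency information might be used to decide such a question is given below. It is a deliberately simplified assumption, not the mechanism of the knowledge base: each entry records only its logic, a flag saying whether a classical entry involves only decidable predicates, and the names of the entries it depends on. The actual conditions for hybrid validity are more subtle.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    name: str
    logic: str                     # "constructive" or "classical" in this toy model
    decidable_only: bool = False   # classical entry that uses only decidable predicates
    depends_on: tuple = ()         # names of the entries this one refers to

def constructively_valid(name, library):
    """Check an entry and, transitively, everything it depends on."""
    entry = library[name]
    ok_here = entry.logic == "constructive" or entry.decidable_only
    return ok_here and all(constructively_valid(d, library) for d in entry.depends_on)

# Hypothetical library contents.
library = {e.name: e for e in [
    Entry("nuprl_lemma", "constructive"),
    Entry("hol_lemma_dec", "classical", decidable_only=True),
    Entry("hol_lemma_em", "classical"),
    Entry("thm_a", "constructive", depends_on=("nuprl_lemma", "hol_lemma_dec")),
    Entry("thm_b", "constructive", depends_on=("nuprl_lemma", "hol_lemma_em")),
]}
```

In this toy library `thm_a` would be marked as constructively valid, whereas `thm_b` would not, because one of its lemmata is classical without the decidability restriction.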

Apart from abstract links from terms to library objects the library does 
not impose any predefined structure. All visible structure, e.g. the directory 
structure as observed by the Nuprl 5 editor, is generated by structure objects
that are explicitly present in the library. This allows exploiting the structure 
of the library and modifying it without having to change the representation of 
already stored knowledge. Structure, as we understand it, is only a matter of 
external presentation, not of internal representation.

For the same reason, there is a separation between objects and the names
that users choose to denote them. Technically, the visible name is just a display
version of the internal name, which makes it possible to choose the same name
for different objects without creating internal name clashes and to disambiguate
the display as needed.

The absence of a predefined library structure is a prerequisite for integrat-
ing formal knowledge from other systems besides Nuprl without requiring these
systems to change their representation structure. The “only” thing that needs
to be done is to emulate this structure in Nuprl’s knowledge base.

User Interfaces. The main user interface of Nuprl 5 is the Nuprl 5 navigator.
Its communication with the knowledge base is based on sending and receiving ab-
stract terms. While displaying and editing terms it interprets the corresponding
structure objects and displays them as directories, theorems, definitions, proofs,
or mathematical expressions, sometimes opening new windows for this purpose.
For the user, it provides the functionality of a structure editor: the user can mark
subterms and edit slots in the displayed term and then cause the navigator to
send the result back to the library, which processes the result while the user may
continue to work with the editor.

The meaning of the abstract terms received by the knowledge base is de-
termined by the structure objects already present. A term could be interpreted as a
command to store the term, to execute a tactic which subsequently calls one or
several refiners, to open a proof editor window, or to send another term to be
displayed. Obviously, this process involves a lot of management information in
the terms being sent that is usually not shown to the user. However, the user 
has the right to edit (almost) all structure objects as well and thus customize 
the appearance of the information presented by the editor. 

The Nuprl 5 editor is capable of interpreting objects as commands. A very
convenient feature resulting from that is a hyperlink mechanism where clicking
on a term causes the corresponding object to be raised, but more general
ML expressions can be executed as well. This enables the user to trace back
definitions of logical expressions and tactics or to customize the editor by adding
buttons for common commands.

In addition to the Nuprl 5 navigator, Nuprl 5 provides emulations of the edi-
tors used in the previous release of Nuprl in order to ensure upward compatibility,
as well as valuable extensions for facilitating proof browsing, merging, replaying
and accounting. There is also a web front end [Nau98] that allows external users
to browse the Nuprl library without having to install the whole system.

Inference Engines. The Nuprl 5 inference engine refines proof goals by exe-
cuting ML code that may include references to library objects, particularly to the
inference rules and tactics stored in the knowledge base. It applies the code to a
given proof goal that it receives as an abstract term and returns the resulting list
of subgoals back to the library. In the process it may invoke decision procedures
and proof checkers. Based on the validations given in the rule objects it can also
extract programs from proofs and evaluate them. The inference mechanism is
fairly straightforward and compatible with the one in Nuprl 4.
As an alternative one may invoke the MetaPRL refiner [Met], a modularized
version of Nuprl’s inference engine implemented in OCaml, that can run up to
100 times faster due to improvements in rewriting and evaluation. The commu-
nication between Nuprl LPE and MetaPRL utilizes the MathBus design [Mat].
We are currently working on connecting a variety of external refiners such
as a constructive first-order theorem prover [K+00], the HOL system (via Maude
[C+99a]), Mathematica, and Isabelle [Nau99]. We will also emulate the refiner of
Nuprl 3 in order to be able to restore older theories that had not been migrated
during the transition to Nuprl 4.

4 Progress and Availability 

Nuprl LPE is the result of more than 25 years of experience with mathematical
proof assistants. We have completed the implementation of the basic Nuprl LPE
system, which provides a platform for the cooperation of a variety of proof
systems through its open distributed architecture.

Using the Nuprl LPE infrastructure we have implemented the Nuprl 5 system
consisting of the Nuprl 5 editor, refiner, evaluator, and the Nuprl 5 core library.
The latter contains the terms and rules of the Nuprl type theory as well as
the standard theories and tactics, which were migrated from Nuprl 4 to Nuprl 5.
Besides the Nuprl 5 refiner a user may also invoke the MetaPRL refiner.

In addition to that we have migrated most of the user-defined theories from
Nuprl 4 to Nuprl 5 and are currently testing the behavior of the new system under
large scale applications like the verification of communication systems. Experi-
ence shows that the separation of library, editor, and refiner makes Nuprl 5 more
efficient than its predecessor Nuprl 4, because it can take advantage of multipro-
cessor machines or a network of computers and run several competing refiners to
solve a goal. We also observed an increased productivity of the system’s users, 
who can now work on other tasks while a refiner solves a complex goal. 

The completion of the basic Nuprl LPE system enables us to increase the
system’s capabilities by adding new editors, refiners, evaluators, or translators 
to the system. The open distributed architecture opens the door for a variety of 
research topics which can be turned into practically useful components as soon 
as their theoretical background has been explored. 

Nuprl LPE is written mostly in Common Lisp, but uses some extensions that
require Lucid or Allegro Lisp. An executable copy running under Linux is avail-
able at

http://www.cs.cornell.edu/Info/Projects/NuPrl/nuprl5/index.html.

Abstract. aRa is an automatic theorem prover for various kinds of re-
lation algebras. It is based on Gordeev’s Reduction Predicate Calculi
for n-variable logic ($\mathrm{RPC}_n$) which allow first-order finite variable proofs.
Employing results from Tarski/Givant and Maddux we can prove va-
lidity in the theories of simple semi-associative relation algebras, rela-
tion algebras and representable relation algebras using the calculi $\mathrm{RPC}_3$,
$\mathrm{RPC}_4$ and $\mathrm{RPC}_\omega$. aRa, our implementation in Haskell, offers different
reduction strategies for $\mathrm{RPC}_n$, and a set of simplifications preserving
n-variable provability.



1 Introduction 

Relations are an indispensable ingredient in many areas of computer science, 
such as graph theory, relational databases, logic programming, and semantics of 
computer programs, to name just a few. So relation algebras - which are exten- 
sions of Boolean algebras - form the basis of many theoretical investigations. As 
they can also be approached from a logic point of view, an application of ATP 
methods promises to be beneficial. 

We follow the lines of Tarski [TG87], Maddux [Mad83] and Gordeev [Gor95]
by converting equations from various relation algebraic theories to finite variable
first-order logic sentences. We are then able to apply Gordeev’s n-variable calculi
$\mathrm{RPC}_n$ to the transformed formulae.

Our implementation aRa is a prover for the $\mathrm{RPC}_n$ calculi with a front-end
to convert relation algebraic propositions to 3-variable first-order sentences. It
implements a fully automatic proof procedure, various reduction strategies, and
some simplification rules, to prove theorems in the theories SSA¹, RA, and RRA.

2 Theoretical Foundations 

Gordeev’s Reduction Predicate Calculi. Gordeev developed a cut free for-
malization of predicate logic without equality using only finitely many distinct
variables. In [Gor95] the reduction predicate calculi $\mathrm{RPC}_n$ were introduced.
These term rewriting systems reduce valid formulae of n-variable logic to true.
n-variable logic comprises only formulae containing no more than n distinct vari-
ables; the same restriction applies to $\mathrm{RPC}_n$. Moreover, all $\mathrm{RPC}_n$ formulae are
supposed to be in negation normal form.

* This work was partially supported by DFG under grant Ku 966/4-1.

¹ This “simple” variant of semi-associative relation algebra (SA) includes the identity
axiom $A \odot 1 = A$ only for literals A.

Formulae provable in $\mathrm{RPC}_n$ are exactly those provable in the standard modus
ponens calculus, or, alternatively, the sequent calculus with cut rule, using at
most n distinct variables ([Gor95], Theorem 3.1 and Corollary). The $\mathrm{RPC}_n$
rewriting systems consist of the following rules:²



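(R1)   $A \vee \top \longrightarrow \top$
(R2)   $L \vee \neg L \longrightarrow \top$
(R3)   $A \wedge \top \longrightarrow A$
(R4)   $A \vee (B \wedge C) \longrightarrow (A \vee B) \wedge (A \vee C)$
(R5)   $\exists x A \longrightarrow \exists x A \vee A[x/t]$
(R6)   $\forall x A \vee B \longrightarrow \forall x A \vee B \vee \forall y(\forall y B \vee A[x-y][x/y])$
(R6')  $\forall x A \longrightarrow \forall x A \vee A[-x] \vee \forall y(A[x-y][x/y])$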



Here, L denotes an arbitrary literal, A, B and C are formulae, x and y are
individual variables, and t is any term. A term may be a variable or a constant;
function symbols do not occur in $\mathrm{RPC}_n$. $F[x/t]$ denotes substitution of x by t,
where $F[x/t]$ does not introduce new variables. $F[-x]$ and $F[x-y]$ are variable
elimination operators (see [Gor95] for a rigorous definition), where $F[-x]$ deletes
all free occurrences of x from F by replacing the respective (positive and neg-
ative) literals contained in F by falsum ($\bot$). The binary elimination operator
$F[x-y]$ is defined by $F[x-y] = F[-y]$ if $x \neq y$, and $F[x-y] = F$ otherwise.
Note that the variable elimination operators are, just like substitution, meta-
operators on formulae, and not part of the formal language of the logic itself.
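
The informal description of the two elimination operators can be turned into a short sketch on formulae in negation normal form. This is our own reading of the description above, not code from aRa (which is written in Haskell), and [Gor95] remains the authority for the rigorous definition. The tuple encoding of formulae and the names `lit`, `eliminate` and `eliminate_binary` are assumptions made for illustration.

```python
TOP, BOT = ("top",), ("bot",)

def lit(positive, pred, *args):
    """A positive or negative literal; its arguments are variable or constant names."""
    return ("lit", positive, pred, args)

def eliminate(f, x):
    """Unary elimination: replace every literal with a free occurrence of x by falsum."""
    tag = f[0]
    if tag == "lit":
        return BOT if x in f[3] else f
    if tag in ("and", "or"):
        return (tag, eliminate(f[1], x), eliminate(f[2], x))
    if tag in ("forall", "exists"):
        # Occurrences of x below a quantifier that binds x are not free.
        return f if f[1] == x else (tag, f[1], eliminate(f[2], x))
    return f  # top or bot

def eliminate_binary(f, x, y):
    """Binary elimination: eliminate y if x and y differ, otherwise leave f unchanged."""
    return eliminate(f, y) if x != y else f

# Example formula: P(x) or (exists x. Q(x, y))
F = ("or", lit(True, "P", "x"), ("exists", "x", lit(True, "Q", "x", "y")))
no_x = eliminate(F, "x")
no_y = eliminate_binary(F, "x", "y")
```

Eliminating x only affects the literal P(x), since the occurrence of x in Q(x, y) is bound, whereas eliminating y replaces Q(x, y) by falsum.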

A formula F is provable in $\mathrm{RPC}_n$ ($\mathrm{RPC}_n \vdash F$) iff it can be reduced to true,
i.e., iff $F \longrightarrow^{*} \top$. We call a formula sequence $(F_0, \ldots, F_n)$ a reduction chain
if $F_i \longrightarrow F_{i+1}$. A reduction strategy is a computable function that extends a
reduction chain by one additional formula. As all essential rules of $\mathrm{RPC}_n$ are of
the form $F \longrightarrow F \vee G$, and thus no “wrong” reductions are possible, strategies
can be used instead of a search procedure. This also means that no backtracking
is needed and the $\mathrm{RPC}_n$ calculi are confluent on the set of valid formulae of
n-variable logic.

aRa implements various reduction strategies rather than unrestricted
breadth-first search or iterative deepening.



² $\vee$ and $\wedge$ are supposed to be AC-operators.

Translation from Relation Algebra to Predicate Logic. In order to prove
formulae in the theory of relation algebra, we follow the idea of [TG87] and
transform sentences of relation algebra to 3-variable first-order sentences. The
transformation $\tau_{xyz}$ is straightforward, where x and y denote the predicate’s
arguments and z is a free variable. For brevity, we give only part of the definition
of $\tau_{xyz}$, for predicate symbols, relative and absolute product ($\odot$ and $\cdot$):

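$$\tau_{xyz}(R) = xRy$$

$$\tau_{xyz}(\Phi \odot \Psi) = \exists z\,(\tau_{xzy}(\Phi) \land \tau_{zyx}(\Psi))$$

$$\tau_{xyz}(\Phi \cdot \Psi) = \tau_{xyz}(\Phi) \land \tau_{xyz}(\Psi)$$

Here, $\Phi$ and $\Psi$ stand for arbitrary relation algebraic expressions. A sentence
$\Phi = \Psi$ from relation algebra is then translated by $\tau$ into an equivalence expression
of first-order logic, a relational inclusion $\Phi \leq \Psi$ into an implication:

$$\tau(\Phi = \Psi) = \forall x \forall y\,(\tau_{xyz}(\Phi) \leftrightarrow \tau_{xyz}(\Psi))$$

$$\tau(\Phi \leq \Psi) = \forall x \forall y\,(\tau_{xyz}(\Phi) \to \tau_{xyz}(\Psi))$$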
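The following sketch illustrates the part of the translation given above for predicate symbols, relative product and absolute product. The tuple encoding of expressions and the ASCII output syntax (`exists z.`, `&`, `<->`, `->`) are assumptions made for this illustration; they are not the data structures or the input language of aRa.

```python
def tau(expr, x, y, z):
    """Translate a relation-algebraic expression into a 3-variable formula.

    x and y are the argument variables, z is the spare variable.
    expr is a nested tuple: ("pred", name), ("rprod", e1, e2), ("aprod", e1, e2).
    """
    kind = expr[0]
    if kind == "pred":
        # Predicate symbol R with arguments x and y.
        return f"{expr[1]}({x},{y})"
    if kind == "rprod":
        # Relative product: quantify the spare variable and rotate variable roles.
        left = tau(expr[1], x, z, y)
        right = tau(expr[2], z, y, x)
        return f"exists {z}.({left} & {right})"
    if kind == "aprod":
        # Absolute product: plain conjunction over the same variables.
        return f"({tau(expr[1], x, y, z)} & {tau(expr[2], x, y, z)})"
    raise ValueError(f"unsupported expression: {expr!r}")


def translate(lhs, rhs, equation=True):
    """Translate an equation (equivalence) or an inclusion (implication)."""
    connective = "<->" if equation else "->"
    body = f"{tau(lhs, 'x', 'y', 'z')} {connective} {tau(rhs, 'x', 'y', 'z')}"
    return f"forall x.forall y.({body})"
```

Nesting relative products shows why three variables suffice for this fragment: the inner product reuses the variable that is spare at that point instead of introducing a fresh one.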

We can simulate proofs in various theories of relation algebra by using the 
following tight link between n-variable logic and relation algebras proved by 
Maddux (see, e.g., [Mad83]): 

1. A sentence is valid in every semi-associative relation algebra (SA) iff its
translation can be proved in 3-variable logic.

2. A sentence is valid in every relation algebra (RA) iff its translation can be
proved in 4-variable logic.

3. A sentence is valid in every representable relation algebra (RRA) iff its
translation can be proved in $\omega$-variable logic.

We have to restrict SA in case of 3-variable logic to its simple variant SSA,
as $\mathrm{RPC}_n$, like the more familiar Hilbert-Bernays first-order formalism,
contains only the simple Leibniz law. Compared to the generalized Leibniz law
used in Tarski's and Maddux's formalisms, this simple schema is finitely (i.e., as
an axiom) representable in $\mathrm{RPC}_n$. The corresponding refinement is not necessary
in case of RA and RRA, as the generalized Leibniz law for 3-variable formulae is
deducible from its simple form in $\mathrm{RPC}_4$, and thus in $\mathrm{RPC}_\omega$ (see, e.g., [Gor99]).

3 Implementation 

The aRa prover is a Haskell implementation of the $\mathrm{RPC}_n$ calculi with a front
end to transform relation algebraic formulae to first-order logic. It offers different
reduction strategies and a set of additional simplification rules. Most of these
simplification rules preserve n-variable provability and can thus be used for SSA-
and RA-proofs.



Input Language. The aRa system is capable of proving theorems of the form
$\{E_1, \ldots, E_n\} \vdash E$, where $E$ and $E_i$ are relation algebraic equations or inclusions.³
Each equation in turn may use the connectives for relative and absolute sum
and product, negation, conversion, and arbitrary predicates, among them the
absolute and relative unit and the absolute zero as predefined predicates.

³ Instead of an equation $E$ a formula $F$ may also be used, which is interpreted as the
equation $F = 1$, where 1 stands for the absolute unit predicate.

The translation of a relational conjecture of the form $\{E_1, \ldots, E_n\} \vdash E$ to
first-order logic is done in accordance with Tarski's deduction theorem
([TG87], 3.3):

$$\tau(\{E_1, \ldots, E_n\} \vdash E) = \tau(E_1) \land \ldots \land \tau(E_n) \to \tau(E)$$

To give an impression of what the actual input looks like, we show the
representation of Dedekind's rule
$(Q \odot R) \cdot S \leq (Q \cdot (S \odot R^{\smile})) \odot (R \cdot (Q^{\smile} \odot S))$
as a conjecture for aRa:

|- (Q@R)*S < (Q*(S@R~))@(R*(Q~@S)) ;



Literal and Reduction Tracking. To guarantee completeness of the deterministic
proof search introduced by reduction strategies, we employ the technique
of reduction tracking in our implementation. The idea is as follows: while
successively constructing the reduction chain, record the first appearance of each
reduction possibility⁴ and track the changes performed on it. A strategy is
complete if each reduction possibility is eventually considered.

Literal tracking is used by the LP reduction strategy described later. It keeps
track of the positions of certain literal occurrences during part of the proof.



Reduction Strategies. We implemented a trivial strategy based on the re- 
duction tracking idea described above and several variants of a literal tracking 
strategy. The latter select a pair of complementary literals (and therefore are 
called LP strategies) that can be disposed of by a series of reduction steps. In 
order to find such pairs, an equation system is set up (similar to unification) 
that is solvable iff there is a reduction sequence that moves the literals (or one 
of their descendants) into a common disjunction. Then, RPC reductions are 
selected according to the equation system to make the literal pair vanish. 



Additional Simplification Rules. To improve proof search behavior we added
some simple rules and strategies to the $\mathrm{RPC}_n$ calculus that preserve n-provability:

1. Give priority to shortening rules (R1), (R2) and (R3).

2. Remove quantifiers that bind no variables.

3. Minimize quantifier scopes.

4. Subgoal generation: To prove $F \land G$ prove first $F$, and then $G$.

5. Additional $\forall$-rule: $\forall x A \lor B \to \forall y\,(A[x/y] \lor B)$ if $y \notin \mathrm{Fr}(A) \cup \mathrm{Fr}(B)$, which
is used with priority over (R6) and (R6').

6. Delete pure literals.

7. Replace $F \lor F'$ resp. $F \land F'$ by $F$, if $F'$ is a bound renaming of $F$.

⁴ A reduction possibility consists of a position in the formula and, in case of the
RPC-rules (R5), (R6) and (R6'), an additional reduction term.

Simplification rule 3 is applied only initially, before the actual proof search starts.
Rules 5 and 7 are used to keep formula sizes smaller during proof search, where
rule 5 is a special form of (R6) whose purpose is to accelerate the introduction
of so far unused variables.

Moreover, there are two additional simplification rules that, however, may
change n-provability: (1) partial Skolemization to remove $\forall$-quantifiers and (2)
replacement of free variables by new constants.



4 Experimental Results 

We made some experiments with our implementation on a Sun Enterprise 450 
Server running at 400 MHz. The Glasgow Haskell Compiler, version 4.04, was 
used to translate our source files. In Table 1 the results of our tests are summa- 
rized. The problem class directly corresponds to the number of variables used 
for the proof, as indicated at the end of Section 2. 



Table 1. aRa run times for some relation algebra problems.

problem       source   class   strat.   proofs   steps   time (ms)
3.2(v)        [TG87]   SSA     LI       2        46      130
3.2(vi)       [TG87]   SSA     LI       2        41      100
3.2(xvii)     [TG87]   SSA     LI       3        22      40
3.1(iii)(e)   [TG87]   RA      AI       6        104     200
3.2(xix)      [TG87]   RA      LA       3        25      50
Thm 2.7       [CT51]   RA      AI       1        12      40
Thm 2.11      [CT51]   RA      AI       1        19      50
Cor 2.19      [CT51]   RA      AI       1        77      75170
Dedekind      [DG98]   RA      AI       1        37      90
Cor 2.19      [CT51]   RRA     AI       1        38      140



In the last three columns the following information is given: the number of
proofs that the problem consists of, the total number of RPC-reductions⁵, and
the total proof time in milliseconds. The first letter in the strategy column is
"L" for the normal LP reduction strategy and "A" for the LP strategy with
priority for the additional $\forall$-simplification rule. The second letter corresponds to
the selection of disjunctive subformulae, i.e., $A$ in rule (R4) and $B$ in rule (R6).
The strategy indicated by letter "I" selects a minimal suitable disjunction, "A"
a maximal one.

The proof of Corollary 2.19 from [CT51] reveals an unexpectedly long run- 
time, which is reduced considerably by allowing more variables for the proof and 
thus switching to RRA. 



⁵ Only reductions with one of the rules (R4), (R5), (R6) and (R6') are considered.

5 Conclusion and Future Work

By using aRa, many small and medium-sized theorems in various relation
algebras could be proved. Formulae not containing the relative unit predicate can
be handled quite efficiently; other formulae suffer from the fact that neither the
$\mathrm{RPC}_n$ calculi nor the aRa prover offers a special treatment of equality.

Compared with other implementations [HBS94, vOG97, DG98], the most
obvious differences are the automatic proof procedure using reduction strategies
and the translation to the $\mathrm{RPC}_n$ calculi. The RALF system [HBS94] offers no
automatic proof search, but has particular strengths in proof presentation. RALL
[vOG97] is based on HOL and Isabelle, and thus is able to deal with higher-order
constructs. It also offers an experimental automatic mode using Isabelle's tactics.
$\delta$RA is a Display Logic calculus for relation algebra, and its implementation
[DG98] is based on Isabelle's metalogic. It also offers an automatic mode using
Isabelle's tactics.

aRa can also be used to generate proofs in ordinary first-order logic and in 
restricted variable logics. As aRa is the initial implementation of a new calculus, 
we expect that further progress is very well possible. Implementation of new 
strategies or built-in equality may be viable directions for improvement. 



Availability. The aRa system is available as source and binary distribution
from www-sr.informatik.uni-tuebingen.de/~sinz/ARA.



References 

[CT51] L. H. Chin and A. Tarski. Distributive and modular laws in the arithmetic
of relation algebras. University of California Publications in Mathematics,
New Series, 1(9):341-384, 1951.

[DG98] J. Dawson and R. Goré. A mechanized proof system for relation algebra using
display logic. In JELIA '98, LNAI 1489, pages 264-278. Springer, 1998.

[Gor95] L. Gordeev. Cut free formalization of logic with finitely many variables, part
I. In CSL '94, LNCS 933, pages 136-150. Springer, 1995.

[Gor99] L. Gordeev. Variable compactness in 1-order logic. Logic Journal of the
IGPL, 7(3):327-357, 1999.

[HBS94] C. Hattensperger, R. Berghammer, and G. Schmidt. RALF - a relation-algebraic
formula manipulation system and proof checker. In AMAST '93,
Workshops in Computing, pages 405-406. Springer, 1994.

[Mad83] R. Maddux. A sequent calculus for relation algebras. Annals of Pure and
Applied Logic, 25:73-101, 1983.

[TG87] A. Tarski and S. Givant. A Formalization of Set Theory without Variables,
volume 41 of Colloquium Publications. American Mathematical Society,
1987.

[vOG97] D. von Oheimb and T. Gritzner. RALL: Machine-supported proofs for relation
algebra. In Automated Deduction - CADE-14, LNAI 1249, pages 380-394.
Springer, 1997.




Scalable Knowledge Representation and 
Reasoning Systems 



Henry Kautz 

AT&T Labs-Research 
180 Park Ave 

Florham Park NJ 07974, USA 
kautz@research.att.com



Abstract. Traditional work in knowledge representation (KR) aimed
to create practical reasoning systems by designing new representation
languages and specialized inference algorithms. In recent years, however,
an alternative approach based on compiling combinatorial reasoning
problems into a common propositional form, and then applying general,
highly efficient search engines, has shown dramatic progress. Some
domains can be compiled to a tractable form, so that run-time problem-solving
can be performed in worst-case polynomial time. But there are
limits to tractable compilation techniques, so in other domains one must
compile instead to a minimal combinatorial "core". The talk will describe
how both problem specifications and control knowledge can be compiled
together and then solved by new randomized search and inference algorithms.
Abstract. An efficient method for minimal model generation is presented.
The method employs branching assumptions and lemmas so as
to prune branches that lead to nonminimal models, and to reduce
minimality tests on obtained models. This method is applicable to other
approaches such as Bry's complement splitting and constrained search or
Niemelä's groundedness test, and greatly improves their efficiency. We
implemented MM-MGTP based on the method. Experimental results with
MM-MGTP show a remarkable speedup compared to MM-SATCHMO.



1 Introduction 

The notion of minimal models is important in a wide range of areas such as 
logic programming, deductive databases, software verification, and hypotheti- 
cal reasoning. Some applications in such areas would actually need to generate 
Herbrand minimal models of a given set of first-order clauses. 

Although the conventional tableaux and the Davis-Putnam methods can con- 
struct all minimal models, they may also generate nonminimal models that are 
redundant and thus would cause inefficiency. In general, in order to ensure that
a model M is minimal, it is necessary to check that M is not subsumed by any
other model. We call this a minimality test on M. Since minimality tests on
obtained models become more expensive as the number of models increases,
it is important to avoid the generation of nonminimal models.
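A naive minimality test can be sketched as follows, assuming models are represented as sets of ground atoms and that a model is subsumed by any other model that is a proper subset of it. The function names are hypothetical. The sketch only shows why the cost grows with the number of models already obtained: each test compares against all of them.

```python
def is_minimal(model, others):
    """A model is minimal if no other model is a proper subset of it."""
    return not any(other < model for other in others)


def minimal_models(models):
    """Filter a collection of models down to the minimal ones (quadratic cost)."""
    frozen = [frozenset(m) for m in models]
    return [m for m in frozen if is_minimal(m, frozen)]
```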

Recently two typical approaches in the tableaux framework have been reported.
Bry and Yahya [1] presented a sound and complete procedure for
generating minimal models and implemented MM-SATCHMO [2] in Prolog. The
procedure rejects nonminimal models by means of complement splitting and
constrained search. Niemelä also presented a propositional tableaux calculus for
minimal model reasoning [8], where he introduced the groundedness test, which
substitutes for constrained searches. However, both approaches have the following
problems: they perform unnecessary minimality tests on models that are
assured to be minimal through a simple analysis of a proof tree, and they cannot
completely prune all redundant branches that lead to nonminimal models.

To solve these problems, we propose a new method that employs branching
lemmas. It is applicable to the above approaches to enhance their ability.
Branching lemmas provide an efficient way of applying factorization [6] to
minimal model generation, and their use is justified by the notion of proof
commitment. Consider model generation with complement splitting. If no proof
commitment occurs between a newly generated model M and any other model
M' that has been obtained (no branch extended below a node labeled with a
literal in M' is closed by the negation of a literal in M), M is guaranteed to
be minimal, and thus a minimality test on M can be omitted. In addition, by
pruning many branches that result in nonminimal models, the search space is
greatly reduced. Both effects can be achieved with branching lemmas.

We implemented the method on a Java version of MGTP [3,4] into which
the functions of CMGTP [9] are already incorporated. We call this system
MM-MGTP. Like MM-SATCHMO, it is applicable to first-order clauses.
Experimental results show a remarkable speedup compared to MM-SATCHMO.

This paper is organized as follows. In Section 2 the basic procedure of MGTP
is outlined, while in Section 3 key techniques for minimal model generation
are described. Then in Section 4 we define the branching lemma and explain
how it works for minimal model generation. Section 5 describes the features of
MM-MGTP, and Section 6 proves the soundness and completeness of minimal
model generation with complement splitting and branching lemmas. In Section
7 we compare experimental results obtained by running MM-MGTP and
MM-SATCHMO, and we discuss related work in Section 8.

2 Outline of MGTP 

Throughout this paper, a clause is represented in implication form:
$A_1 \land \ldots \land A_m \to B_1 \lor \ldots \lor B_n$, where $A_i$ ($1 \leq i \leq m$) and $B_j$ ($1 \leq j \leq n$) are literals; the
left hand side of $\to$ is said to be the antecedent, and the right hand side of $\to$ the
consequent. A clause is said to be positive if its antecedent is true ($m = 0$), and
negative if its consequent is false ($n = 0$). A clause for $n \leq 1$ is called a Horn
clause, otherwise a clause for $n > 1$ is called a non-Horn clause. A clause is said
to be range-restricted if every variable in the consequent of the clause appears
in the antecedent, and violated under a set $M$ of ground literals if it holds that
$\forall i\,(1 \leq i \leq m)\; A_i\sigma \in M \;\land\; \forall j\,(1 \leq j \leq n)\; B_j\sigma \notin M$ with some substitution $\sigma$.
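The clause classification above can be made concrete with a small sketch. It is restricted to ground clauses, so the substitution in the definition of "violated" is omitted, and the `Clause` class and its method names are hypothetical rather than taken from MGTP.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Clause:
    """A ground clause A1 & ... & Am -> B1 | ... | Bn."""
    antecedent: Tuple[str, ...]
    consequent: Tuple[str, ...]

    def is_positive(self) -> bool:
        return len(self.antecedent) == 0      # m = 0

    def is_negative(self) -> bool:
        return len(self.consequent) == 0      # n = 0

    def is_horn(self) -> bool:
        return len(self.consequent) <= 1      # n <= 1

    def violated(self, model: FrozenSet[str]) -> bool:
        """All antecedent literals are in the model, no consequent literal is."""
        return (all(a in model for a in self.antecedent)
                and all(b not in model for b in self.consequent))
```

For instance, the ground clause `Clause(("a",), ("b",))` is a Horn clause that is violated under {a} but not under {a, b} or under the empty set.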

A sequential algorithm of the MGTP procedure mg is sketched in Fig. 1.
Given a set $S$ of clauses, mg tries to construct a model by extending the current
model candidate $M$ so as to satisfy violated clauses under $M$ (model extension).
This process forms a proof tree called an MG-tree. In Fig. 1, operations to
construct a model tree $T$ consisting of only models are added to the original
procedure, for use in a model checking type MM-MGTP to be explained later.
The function mg0 takes, as an initial input, the consequents of positive Horn
and non-Horn clauses, an empty model candidate $M$ and a null model tree $T$,
and returns SAT/UNSAT as a proof result. It works as follows:

(1) As long as the unit buffer $U$ is not empty, mg0 picks up a unit literal $u$
from $U$, and extends a model candidate $M$ with $u$ (Horn extension). $T' \oplus u$
means that $u$ is attached to the leaf of $T'$. Then, the conjunctive matching
procedure CJM(u, M) is invoked to search for clauses whose antecedents
are satisfied by $M$ and $u$. If such nonnegative clauses are found, their
consequents are added to $U$ or the disjunction buffer $D$ according to the form
of a consequent. When the antecedent of a negative clause is satisfied by
$M \cup \{u\}$ in CJM(u, M), or the unit refutation rule shown in Fig. 2 applies
to $M \cup \{u\}$, mg0 rejects $M$ and returns UNSAT (model rejection).

(2) When $U$ becomes empty, the procedure Simp&Subsump(D, M) is invoked
to apply the disjunction simplification rule shown in Fig. 3 and to perform
subsumption tests on $D$ against $M$. If a singleton disjunction is derived as
a consequence of disjunction simplification, it is moved from $D$ to $U$. When
an empty clause is derived, mg0 rejects $M$ and returns UNSAT.

(3) If $D$ is not empty, mg0 picks up a disjunction $d$ from $D$ and recursively
calls mg0 to expand $M$ with each disjunct $L_j \in d$ (non-Horn extension).
$A \circ B$ returns SAT if either $A$ or $B$ is SAT, otherwise returns UNSAT.
$T' \oplus (T_1, \ldots, T_n)$ means that each sub model tree $T_j$ of $L_j$ is attached to the
leaf of $T'$, where $(\phi, \ldots, \phi) = \phi$ and $T \oplus \phi = T$.

(4) When both $U$ and $D$ become empty, mg0 returns SAT.

    U_0 ← positive Horn clauses;  D_0 ← positive non-Horn clauses;
    T ← φ;  Ans ← mg0(U_0, D_0, ∅, T);

    function mg0(U, D, M, var T) {  T' ← φ;
      while (U ≠ ∅) {  U ← U \ {u ∈ U};                                 ··· (1)
        if (u ∉ M) {  M ← M ∪ {u};  T' ⊕ u;  CJM(u, M);
          if ( M is rejected ) return UNSAT;  }
        if (U = ∅) {  Simp&Subsump(D, M);                               ··· (2)
          if ( M is rejected ) return UNSAT;  }  }
      if (D ≠ ∅) {  d ← (L_1 ∨ ... ∨ L_n) ∈ D;  D ← D \ {d};             ··· (3)
        A ← UNSAT;
        for j ← 1 to n {  T_j ← φ;  A ← A ∘ mg0(U ∪ {L_j}, D, M, T_j);  }
        T ⊕ T' ⊕ (T_1, ..., T_n);  return A;  }
      else {  T ⊕ T';  return SAT;  }  }                                ··· (4)

Fig. 1. The MGTP procedure mg

    L ∈ M    ¬L ∈ M
    ----------------
           ⊥

Fig. 2. Unit refutation

    L(¬L) ∈ M    ¬L(L) ∨ C ∈ D
    ---------------------------
                C

Fig. 3. Disjunction simplification
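A heavily simplified sketch of model generation may help to follow steps (1) to (4). It assumes ground clauses over atoms only, given as `(antecedent, consequent)` pairs of tuples, and it deliberately omits the unit and disjunction buffers, conjunctive matching, disjunction simplification, subsumption and the model tree. It is therefore not the MGTP procedure itself, only the extend-or-reject skeleton; the function name `generate_models` is hypothetical.

```python
def generate_models(clauses):
    """Collect all models found by naive ground model generation.

    clauses: list of (antecedent, consequent) pairs of tuples of atoms.
    An empty result corresponds to UNSAT, a non-empty one to SAT.
    """
    models = []

    def extend(candidate):
        for antecedent, consequent in clauses:
            violated = (all(a in candidate for a in antecedent)
                        and not any(c in candidate for c in consequent))
            if violated:
                if not consequent:
                    return              # negative clause: model rejection
                for literal in consequent:
                    # one disjunct: Horn extension; several: non-Horn extension
                    extend(candidate | {literal})
                return
        models.append(candidate)        # no violated clause: success branch

    extend(frozenset())
    return models
```

Run on the clauses `→ a ∨ b ∨ c`, `a → b` and `c →`, this sketch returns the models {a, b} and {b} and rejects the candidate {c}.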

The nodes of an MG-tree except the root node are all labeled with literals
used for model extension. A branch or a path from the root to a leaf corresponds
to a model candidate. Failed branches are those closed by model rejection, and
are marked with $\times$ at their leaves. A branch is a success branch if it ends with a
node at which model extension cannot be performed any more. Figure 4 gives an
MG-tree for the clause set $S1$. Here two models $\{a, b\}$ and $\{b\}$ are obtained, while
a model candidate $\{c\}$ is rejected. $\{a, b\}$ is nonminimal since it is subsumed by
$\{b\}$. We say that a model $M$ subsumes $M'$ if $M \subset M'$. The mark $\odot$ ($\otimes$) placed at
a leaf on a success branch indicates that the model corresponding to the branch
is minimal (respectively nonminimal).

    S1 = { → a ∨ b ∨ c.   a → b.   c → . }

    [MG-tree: the root branches into a, b and c. The a-branch is extended
    with b and ends in ⊗, the b-branch ends in ⊙, and the c-branch is
    closed (×).]

Fig. 4. S1 and its MG-tree

    (B_11 ∧ ... ∧ B_1k_1) ∨ ... ∨ (B_n1 ∧ ... ∧ B_nk_n)
    ----------------------------------------------------
       B_11                          B_n1
        ...           ...             ...
       B_1k_1                        B_nk_n

Fig. 5. Splitting rule

MGTP allows an extended clause of the form
$Ante \to (B_{11} \land \ldots \land B_{1k_1}) \lor \ldots \lor (B_{n1} \land \ldots \land B_{nk_n})$ as in [5]. The clause implies that model extension with
it is performed according to the splitting rule shown in Fig. 5.

Major operations in MGTP, such as conjunctive matching, subsumption testing,
unit refutation, and disjunction simplification, comprise a membership test
to check if a literal $L$ belongs to a model candidate $M$. So speeding up the test
is the key to achieving good performance. For this, we introduced a facility
called an Activation-cell (A-cell) [4]. It retains a boolean flag to indicate whether
a literal $L$ is in the current model candidate $M$ under construction (active) or
not (inactive). On the other hand, all occurrences of $L$ are uniquely represented
as a single object in the system (no copies are made), and the object has an ac
field to refer to an A-cell. So, whether $L \in M$ or not is determined by merely
checking the ac field of $L$. By using the A-cell facility, every major operation in
MGTP can be performed in $O(1)$ w.r.t. the size of the model candidate.
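The A-cell idea can be sketched as follows, assuming that literals are interned by name so that every occurrence shares one object, and that each literal object carries an `ac` reference to its activation cell. The class and function names here are illustrative assumptions; MGTP's Java implementation is not shown in the text.

```python
class ACell:
    """Activation cell: a flag telling whether its literal is in the model candidate."""
    __slots__ = ("active",)

    def __init__(self):
        self.active = False


class Literal:
    """A literal with a unique shared representation and an `ac` field."""
    _interned = {}

    def __init__(self, name):
        self.name = name
        self.ac = ACell()

    @classmethod
    def get(cls, name):
        """Return the single shared object for this literal (no copies are made)."""
        if name not in cls._interned:
            cls._interned[name] = cls(name)
        return cls._interned[name]


def add_to_model(literal):
    literal.ac.active = True


def remove_from_model(literal):
    literal.ac.active = False


def in_model(literal):
    """Membership test: one flag lookup, independent of the model size."""
    return literal.ac.active
```

The design choice is that membership is stored on the literal rather than in a container for the model, so the test does not search the model candidate at all.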

3 Minimal Model Generation 

The first clause $\to a \lor b \lor c$ in $S1$ is equivalent to an extended clause
$\to (a \land \neg b \land \neg c) \lor (b \land \neg c) \lor c$. By applying the splitting rule in Fig. 5 to the extended clause,
the nonminimal model $\{a, b\}$ of $S1$ can be pruned since the unit refutation rule
applies to $b$ and $\neg b$. The added $\neg b$ and $\neg c$ are called branching assumptions and
they are denoted by $[\neg b]$ and $[\neg c]$, respectively. In general, non-Horn extension
with a disjunction $L_1 \lor L_2 \lor \ldots \lor L_n$ is actually performed using an augmented
one $(L_1 \land [\neg L_2] \land \ldots \land [\neg L_n]) \lor (L_2 \land [\neg L_3] \land \ldots \land [\neg L_n]) \lor \ldots \lor L_n$, which exactly
corresponds to an application of the complement splitting rule [1].
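The augmentation of a disjunction with branching assumptions can be written down directly. In this sketch a negated literal is encoded as the pair `("not", L)`, which is an assumed encoding chosen for the example, and `complement_split` is a hypothetical name.

```python
def complement_split(disjuncts):
    """Pair each disjunct with the negations of all disjuncts to its right.

    Returns a list of (literal, branching_assumptions) in left-to-right order.
    """
    return [
        (literal, [("not", right) for right in disjuncts[i + 1:]])
        for i, literal in enumerate(disjuncts)
    ]
```

For the disjunction a ∨ b ∨ c this yields a with [¬b] and [¬c], b with [¬c], and c with no assumptions, matching the extended clause given above.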

Complement splitting guarantees that the leftmost model in an MG-tree is
always minimal, as proven by Bry and Yahya [1]. However, all other models
generated to the right of it are not necessarily minimal. For instance, given the
clause set $S2$ in Fig. 6, we obtain a minimal model $\{a\}$ on the leftmost branch,
while obtaining a nonminimal model $\{b, a\}$ on the rightmost branch.
    S2 = { → a ∨ b.   b → a. }

    [MG-tree: the root branches into a and b. The a-branch carries the
    branching assumption [¬b] and ends in ⊙; the b-branch is extended with a
    and ends in ⊗.]

Fig. 6. Ineffective branching assumption

Fig. 7. An MG-tree with branching lemmas



In order to ensure that every model obtained is minimal, MM-SATCHMO
employs constrained search based on model constraints [1] as follows. When a
minimal model $\{L_1, \ldots, L_m\}$ is found, a model constraint, i.e., a new negative
clause $L_1 \land \ldots \land L_m \to$ is added to the given clause set. For instance, in Fig. 6,
a negative clause $a \to$ is added to $S2$ when the minimal model $\{a\}$ is obtained.
The negative clause forces the nonminimal model $\{b, a\}$ to be rejected.

However, this method needs to maintain negative clauses being added dy- 
namically, the number of which might increase significantly. Moreover, it may 
bring rather heavy overhead due to conjunctive matching on the negative clauses, 
which is performed every time a model candidate is extended. 

To alleviate the above memory consumption problem, Niemelä's approach
[8] seems to be promising. His method works as follows. Whenever a model
$M = \{L_1, \ldots, L_m\}$ is obtained, it is tested whether $\forall L \in M:\; S \cup \overline{M} \models L$ holds
or not, where $S$ is the given clause set and $\overline{M} = \{\neg L' \mid L' \notin M\}$. This test,
called the groundedness test, is nothing but reconstructing a new tableau with
a temporarily augmented clause set $S_M = S \cup \overline{M} \cup \{L_1 \land \ldots \land L_m \to\}$. If $S_M$ is
unsatisfiable, then it is concluded that $M$ is minimal, otherwise nonminimal.
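The test on the augmented clause set can be illustrated for the propositional case. The sketch assumes that M is already known to be a model of S, that clauses are `(antecedent, consequent)` pairs over atoms, and it replaces the tableau construction by a brute-force truth-table check. That substitution is made here only to keep the example short and is not the method described in [8]; both function names are hypothetical.

```python
from itertools import product


def satisfiable(clauses, atoms):
    """Brute-force satisfiability over all interpretations of the given atoms."""
    for bits in product([False, True], repeat=len(atoms)):
        interp = {a for a, bit in zip(atoms, bits) if bit}
        if all(not set(ante) <= interp or any(c in interp for c in cons)
               for ante, cons in clauses):
            return True
    return False


def is_minimal_by_groundedness(clauses, model, atoms):
    """Check minimality of `model` via unsatisfiability of the augmented set."""
    augmented = list(clauses)
    # M-bar: every atom outside the model is forced false (negative unit clause).
    augmented += [((a,), ()) for a in atoms if a not in model]
    # Negative clause L1 & ... & Lm -> built from the model itself.
    augmented.append((tuple(sorted(model)), ()))
    return not satisfiable(augmented, atoms)
```

With S2 = {→ a ∨ b, b → a}, the model {a} is reported minimal because the augmented set is unsatisfiable, whereas {a, b} is not, since {a} satisfies the augmented set.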

4 Branching Lemma 

If branching assumptions are added symmetrically, inference with them becomes
unsound. For instance, consider the clause set $S2'$ obtained by adding a clause
$a \to b$ to $S2$ in Fig. 6. If $\neg a$ is added to the disjunct $b$, no models are obtained
for $S2'$, although a minimal model $\{a, b\}$ does exist. However, for $S2$, $\neg a$ can be
added to $b$ to reject the model $\{b, a\}$, because the proof below $a$ does not depend
on that of $b$, that is, there is no mutual proof commitment between the two
branches. In this situation, we can use $\neg a$ as a unit lemma in the proof below $b$.

Definition 1. Let $L_i$ be a disjunct in $L_1 \lor \ldots \lor L_n$ used for non-Horn extension.
The disjunct $L_i$ is called a committing disjunct, if a branch expanded below $L_i$ is
closed by the branching assumption $[\neg L_j]$ of some right sibling disjunct $L_j$ ($i+1 \leq j \leq n$).
On the other hand, every right sibling disjunct $L_k$ ($i+1 \leq k \leq n$) is
called a committed disjunct from $L_i$.

Fig. 8. Pruning by branching lemmas

Fig. 9. Omitting minimality test

Definition 2. If $L_i$ in $L_1 \lor \ldots \lor L_n$ used for non-Horn extension is not a
committing disjunct, we add $\neg L_i$ to every right sibling disjunct $L_j$ ($i+1 \leq j \leq n$).
Such $\neg L_i$ is called a branching lemma and is denoted by $[[\neg L_i]]$.

For example, in Fig. 7, is a committing disjunct since the assumption 
[^6^] is used to close the leftmost branch expanded below a^. Here, a super- 
script is added to a literal to identify an occurrence of the identical literal, e^, b^ 
are committed disjuncts since they are committed from a^. Branching lemmas 
hc^l, he^l, and hc^| are generated from non-committing disjuncts c^, e^, and 
c^, respectively, whereas ha^| cannot be generated from the committing dis- 
junct a^. 

Definition 3. Let $M$ be a model obtained in an MG-tree. If it contains a
committed disjunct $L_j$ in $L_1 \lor \ldots \lor L_n$ used for non-Horn extension, each committing
disjunct $L_i$ appearing as a left sibling of $L_j$ is said to be a committing disjunct
relevant to $M$. $M$ is said to be a safe model if it contains no committed disjuncts.
Otherwise, $M$ is said to be a warned model.

With branching lemmas, it is possible to prune branches that would lead to 
nonminimal models as shown in Fig. 8. In addition to this, branching lemmas 
have a great effect of reducing minimality tests as described below. 



Omitting a Minimality Test. If an obtained model $M$ is safe, $M$ is assured
to be minimal so that no minimality test is required. Intuitively, this is justified
as follows. If $L_j \in M$ is a disjunct in some disjunction $L_1 \lor \ldots \lor L_n$, it cannot
be a committed disjunct by Definition 3. For each left sibling disjunct $L_k$ ($1 \leq k \leq j-1$),
a model $M'$ containing $L_k$, if any, satisfies the following: $L_j \notin M'$
under branching assumption $[\neg L_j]$, and $L_k \notin M$ under branching lemma $[[\neg L_k]]$.
Thus, $M$ is not subsumed by $M'$.

For instance, in Fig. 9, all obtained models of 55 are assured to be minimal 
without performing any minimality test, since they are safe. 



Restricting the Range of a Minimality Test. On the other hand, if an
obtained model $M$ is warned, it is necessary to perform a minimality test on $M$
against models that have been obtained. However, minimality tests should be
performed only against models that contain committing disjuncts relevant
to $M$. We call this a restricted minimality test. The reason why a minimality test
is necessary in this case is as follows: Suppose that $L_k$ is a committing disjunct
relevant to $M$, $L_j$ is the corresponding committed disjunct in $M$ and $M'$ is a
model containing $L_k$. Although $L_j \notin M'$ under branching assumption $[\neg L_j]$, it
may hold that $L_k \in M$ since the branching lemma $[[\neg L_k]]$ is not allowed for $M$.
Thus, $M$ may be subsumed by $M'$.

For example, in Fig. 7, model Mi is safe because it contains no committed 
disjunct. Thus, a minimality test on M\ can be omitted. Models M 2 , M^, M 4 
are warned because they contain the committed disjunct or b^. Hence, they 
require minimality tests. Since is the committing disjunct relevant to each of 
them, minimality tests on them are performed^ only against Mi containing . 

5 Implementation of MM-MGTP 

We have implemented two types of a minimal model generation prover
MM-MGTP: model checking and model re-computing. The former is based on Bry
and Yahya's method and the latter on Niemelä's method.



Model Checking MM-MGTP. Although a model checking MM-MGTP is
similar to MM-SATCHMO, the way of treating model constraints differs somewhat.
Instead of dynamically adding model constraints (negative clauses) to the
given clause set, MM-MGTP retains them in the form of a model tree $T$. Thus,
the constrained search for minimal models in MM-SATCHMO is replaced by a
model tree traversal for minimality testing. For this, whenever a warned model
$M$ is obtained at (4) in Fig. 1, mg invokes the attached procedure mchk.

Gonsider Fig. 7 again. When the proof of has completed, the node Nai 
labeled with in T is marked as having a committing disjunct, and a pointer to 
the A-cell allocated for its parent (root) is assigned to a com field of the corre- 
sponding committed literals e^,b^. By this, e^, b^ are recognized to be committed 
just by checking their com fields, and branches below e^, b^ are identified to be 
warned. Hence, when a model M 3 is generated, mchk first finds the committed 
disjunct b^ in M 3 . Then, finding 6 ^’s left sibling node Ngi, mchk traverses down 
paths below A(ji searching for a minimal model that subsumes M 3 . 

During the traversal of a path in $T$, each node on the path is examined
whether a literal $L$ labeling the node belongs to the current model $M$ (active) or
not by checking the ac field of $L$. If $L$ is active, it means that $L \in M$, otherwise
$L \notin M$. For the latter, mchk quits traversing the path immediately and searches
for another one. If mchk reaches an active leaf, it means that $M$ is subsumed by
the minimal model on the traversed path, and thus $M$ is nonminimal.

Here, we employ early pruning as follows. If the current model
$M = \{L_1, \ldots, L_i, \ldots, L_m\}$ is subsumed by the previous model $M'$ such that $M' \subset \{L_1, \ldots, L_i\}$,
we can prune the branches below $L_i$. Although mchk is invoked after $M$ has been
generated, our method is more efficient than using model constraints, since it
performs a minimality test not every time model extension with $L_j \in M$ occurs,
but only once when a warned model $M$ is obtained.

¹ For further refinement, a minimality test on $M$ that contains no committing disjunct
relevant to it can be omitted. This is the case for $M_2$.



Model Re-computing MM-MGTP. A model re-computing MM-MGTP can
also be implemented easily. In this version, model tree operations are removed
from mg, and the re-computation procedure rcmp for minimality testing is
attached at (4) in Fig. 1. rcmp is the same as mg except that some routines are
modified for restarting the execution. It basically performs groundedness tests
in the same way as Niemelä's: whenever a warned model $M = \{L_1, \ldots, L_m\}$ is
obtained, mg invokes rcmp to restart model generation for $S \cup \overline{M}$ with a
negative clause $C_M = L_1 \land \ldots \land L_m \to$ being added temporarily. If rcmp returns
UNSAT, then $M$ is assured to be minimal. Otherwise, a model $M'$ satisfying
$M' \subset M$ should be found, and then $M$ will be rejected since it turns out to
be nonminimal. Note that no model $M''$ satisfying $M'' \supseteq M$ will be generated
by rcmp because of the constraints $\overline{M}$ and $C_M$. Note also that those minimal
models which subsume $M$, if any, must be found to the left of $M$ in the MG-tree,
due to complement splitting.

The slight difference with Niemelä's is that in place of $C_M$ above, we use a
shortened negative clause $L_{k_1} \land \ldots \land L_{k_r} \to$ consisting of committed disjuncts
in $M$. It is obtained by removing from $C_M$ uncommitted literals $L_u \in M$, i.e.,
those not committed from their left siblings. For instance in Fig. 7, when model
Mr = is obtained, a shortened negative clause will be created 

instead of 6 ^ A A d? since only is the committed disjunct in M 4 . 

The use of shortened negative clauses corresponds to the restricted minimality
test and makes it possible to avoid groundedness tests on such uncommitted literals
$L_u$. The validity of using shortened negative clauses is given in Theorem 4.

6 Soundness and Completeness 

In this section, we present some results on soundness and completeness of the 
MM-MGTP procedure. First, we show that model generation with factorization 
is complete for generating minimal models. This implies that the MM-MGTP 
procedure is also complete because the use of branching assumptions and lemmas 
can be viewed as an application of factorization. Second, we give a necessary 
condition for a generated model to be nonminimal. The restricted minimality test 
keeps minimal model soundness because it is performed whenever the condition is 
satisfied. Last, we prove that using shortened negative clauses for a groundedness 
test guarantees the minimality of generated models. 

Theorem 1. Let $T$ be a proof tree of a set $S$ of clauses, $N_1$ and $N_2$ be sibling
nodes in $T$, $L_i$ a literal labeling $N_i$ and $T_i$ a subproof tree below $N_i$ ($i = 1,2$),
as shown in Fig. 10(a). If $N_2$ has a descendant node $N_3$ labeled with $L_1$, then
for each model $M$ through a subproof tree $T_3$ below $N_3$, there exists a model $M'$
through $T_1$ such that $M' \subseteq M$ (Fig. 10(b)).

Fig. 10. Proof trees explaining Theorems 1 and 2 (panels (a), (b), (c); sibling nodes $N_1 : L_1$ and $N_2 : L_2$)


Proof. We define the sequence of literals $s_1, s_2, \dots$ constituting $M'$ through $T_1$
by induction. Let $I$ be the set of literals on the path $P$ ending with $N_1$. If $N_1$ is a
leaf ($T_1 = \phi$), $P$ must be a success path and $M' = I \subseteq M$, because $M$ is a model
through $T_3$ and $L_1 \in M$. Otherwise, there is a clause $C_1 = \Gamma_1 \to L^1_1 \vee \dots \vee L^1_{m_1}$
violated under $I$ and used for model extension at $N_1$ (Fig. 10(c)). Then, $M \models \Gamma_1$
and $M \models L^1_1 \vee \dots \vee L^1_{m_1}$ since $M$ is a model. So, there exists a node labeled with
$s_1$ such that $s_1 \in \{L^1_1, \dots, L^1_{m_1}\}$ and $s_1 \in M$. Suppose that we have defined
the first $n$ literals of $M'$ in $T_1$ to be $s_1, \dots, s_n$, by traveling down the successor
nodes whose labels belong to $M$. If the node labeled with $s_n$ is a leaf, we are
done. Otherwise, we may continue our definition of $M'$. The sequence either ends
after finitely many steps with the label of a leaf or continues forever. In either case,
there exists a model $M'$ containing the literals in the sequence such that
$M' = I \cup \{s_1, s_2, \dots\} \subseteq M$. □

The above is a fundamental theorem for proving the minimal model
completeness of model generation with factorization. We define our factorization
in essentially the same manner as tableau factorization [6]. To avoid a circular
argument, a factorization dependency relation is arranged on a proof tree.

Definition 4 (Factorization Dependency Relation). A factorization
dependency relation on a proof tree is a strict partial ordering $\prec$ relating sibling
nodes in the proof tree. A relation $N_1 \prec N_2$ means that searching for minimal
models below $N_2$ is committed to that below $N_1$.



Definition 5 (Factorization). Given a proof tree $T$ and a factorization
dependency relation $\prec$ on $T$, first select a node $N_3$ labeled with literal $L_1$ and another
node $N_1$ labeled with the same literal $L_1$ such that (1) $N_3$ is a descendant of
a node $N_2$ which is a sibling of $N_1$, and (2) $N_2 \not\prec N_1$. Next, close the branch
extended to $N_3$ (denoted by ★) and modify $\prec$ by adding a relation $N_1 \prec N_2$, then
forming the transitive closure of the relation. The symbol ★ means that the proof
of $N_3$ is committed to that of $N_1$. The situation is depicted in Fig. 10(d).
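
The bookkeeping required by Definitions 4 and 5 can be sketched as follows. This is an illustrative data structure only; the class and method names (`FactorizationRelation`, `may_commit`, `commit`) are hypothetical and do not come from the paper, and nodes are assumed to be identified by plain strings.

```python
class FactorizationRelation:
    """Strict partial order on sibling nodes.
    A pair (n1, n2) in `pairs` stands for n1 ≺ n2, i.e. the search
    below n2 is committed to the search below n1."""

    def __init__(self):
        self.pairs = set()

    def precedes(self, n1, n2):
        return (n1, n2) in self.pairs

    def may_commit(self, n1, n2):
        """Condition (2) of Definition 5: n1 ≺ n2 may be added only if n2 ⊀ n1."""
        return n1 != n2 and not self.precedes(n2, n1)

    def commit(self, n1, n2):
        """Record n1 ≺ n2, then re-form the transitive closure."""
        if not self.may_commit(n1, n2):
            raise ValueError(f"adding {n1} ≺ {n2} would create a circular commitment")
        self.pairs.add((n1, n2))
        changed = True
        while changed:
            changed = False
            for a, b in list(self.pairs):
                for c, d in list(self.pairs):
                    if b == c and (a, d) not in self.pairs:
                        self.pairs.add((a, d))
                        changed = True


rel = FactorizationRelation()
rel.commit("N1", "N2")               # a node below N2 was factorized with N1
rel.commit("N2", "N3")               # a node below N3 was factorized with N2
print(rel.precedes("N1", "N3"))      # True, by transitivity
print(rel.may_commit("N3", "N1"))    # False: N1 ≺ N3 already holds
```

Refusing a commitment whenever the reverse relation already holds is what keeps the ordering strict and rules out the circular argument mentioned above.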



Corollary 1. Let S be a set of clauses. If a minimal model M of S is built by 
model generation, then M is also built by model generation with factorization. 



Proof. Immediate from Theorem 1. □
The model generation procedure is minimal model complete (in the sense that
it generates all minimal models) for range-restricted clauses [1]. This implies the
minimal model completeness of model generation with factorization.

Corollary 2 (Minimal Model Completeness of Model Generation with Factorization).
Let $S$ be a satisfiable set of range-restricted clauses and $T$ a proof tree by
model generation with factorization. If $M$ is a minimal model of $S$, then $M$ is
found in $T$.

We consider model generation with branching assumptions and lemmas as
arranging a factorization dependency relation on sibling nodes $N_1, \dots, N_m$ labeled
with $L_1, \dots, L_m$, respectively, as follows: $N_j \prec N_i$ for all $j$ ($i < j \le m$) if $L_i$ is a
committing disjunct, while $N_i \prec N_j$ if $[[\neg L_i]]$ is used below $N_j$. This consideration
leads to the minimal model completeness of the MM-MGTP procedure.

Corollary 3 (Minimal Model Completeness of MM-MGTP). Let $S$ be a
satisfiable set of range-restricted clauses and $T$ a proof tree by model generation
with branching assumptions and branching lemmas. If $M$ is a minimal model of
$S$, then $M$ is found in $T$.

Although model generation with factorization can suppress the generation of 
nonminimal models, it may still generate them. In order to make the procedure 
sound, that is, to make it generate minimal models only, we need a minimality 
test on an obtained model. The following theorem gives a necessary condition 
for a generated model to be nonminimal. 

Theorem 2. Let $S$ be a set of clauses and $T$ a proof tree of $S$ by model
generation with factorization. Let $N_1$ and $N_2$ be sibling nodes in $T$, $T_i$ a subproof tree
below $N_i$ and $M_i$ a model through $T_i$ ($i = 1,2$). If $N_2 \not\prec N_1$, then $M_1 \not\subset M_2$.

Proof. Suppose that $N_i$ is labeled with a literal $L_i$ ($i = 1,2$) (Fig. 10(a)). It
follows from $N_2 \not\prec N_1$ that (1) $N_1 \prec N_2$ or (2) there is no $\prec$-relation between
$N_1$ and $N_2$. If (1) holds, $L_1 \notin M_2$ because every node labeled with $L_1$ in $T_2$ has
been factorized with $N_1$. On the other hand, $L_1 \in M_1$. Therefore, $M_1 \not\subset M_2$. If
(2) holds, $L_1 \notin M_2$ because there is no node labeled with $L_1$ in $T_2$. Therefore,
$M_1 \not\subset M_2$. □

Theorem 2 says that (1) if $N_2 \not\prec N_1$, no minimality test on $M_2$ against $M_1$
is required; otherwise, (2) if $N_2 \prec N_1$, we need to check the minimality of $M_2$
against $M_1$.

In MM-MGTP based on a depth-first-left-first search, omitting a minimality
test on a safe model is justified by (1) above, while the restricted minimality
test on a warned model is justified by (2). If $N_1$ were a left sibling of $N_2$ such
that $N_1 \prec N_2$, e.g., a branching lemma $[[\neg L_1]]$ is used below $N_2$, a minimality
test on $M_1$ against $M_2$ would be required according to Theorem 2. However, it is
unnecessary in MM-MGTP since $M_2 \not\subset M_1$ always holds, as follows.

Theorem 3. Let $T$ be a proof tree by a depth-first-left-first search version of
model generation with factorization and $M_1$ a model found in $T$. If a model $M_2$
is found to the right of $M_1$ in $T$, then $M_2 \not\subset M_1$.
Proof. Let $P_{M_i}$ be a path corresponding to $M_i$ ($i = 1,2$). Then, there are sibling
nodes $N_1$ on $P_{M_1}$ and $N_2$ on $P_{M_2}$. Let $L_i$ be the label of $N_i$ ($i = 1,2$). Now assume
that $M_2 \subset M_1$. This implies $L_2 \in M_1$. Then, there is a node labeled with
$L_2$ on $P_{M_1}$. However, this node can be factorized with $N_2$ in the depth-first-left-first
search. This contradicts that $M_1$ is found in $T$. Therefore, $M_2 \not\subset M_1$. □



Corollary 4 (Minimal Model Soundness of MM-MGTP). Let $S$ be a
satisfiable set of range-restricted clauses and $T$ a proof tree by model generation
with branching assumptions, branching lemmas, and restricted minimality tests.
If $M$ is a model found in $T$, then $M$ is a minimal model of $S$.

The following theorem says that a shortened negative clause for the ground- 
edness test guarantees the minimality of generated models. 

Definition 6. Let $S$ be a set of clauses, $T$ a proof tree of $S$ by model generation
with factorization and $M$ a model found in $T$. For each literal $L \in M$, $N_L$ denotes
a node labeled with $L$ on the path of $M$. Let $M_P \subseteq M$ be a set satisfying the
following condition: for every $L \in M_P$, there exists a node $N$ such that $N_L \prec N$.
Note that $M_P$ is a set of committed disjuncts in $M$. $C_{M_P}$ denotes a shortened
negative clause of the form $L_1 \wedge \dots \wedge L_m \to$ where $L_i \in M_P$ ($i = 1, \dots, m$).



Theorem 4. Let $S$ be a set of clauses. $M$ is a minimal model of $S$ if and only
if $S_{M_P} = S \cup \overline{M} \cup \{C_{M_P}\}$* is unsatisfiable, where $\overline{M} = \{L' \to \mid L' \notin M\}$.

Proof. (Only-if part) Let $M'$ be a model of $S$. There are three cases according
to the relationship between $M$ and $M'$: (1) $M' \setminus M \neq \emptyset$, (2) $M' = M$, or (3)
$M' \subset M$. If (1) holds, $M'$ is rejected by a negative clause in $\overline{M}$. If (2) holds,
$M'$ is rejected by the shortened negative clause $C_{M_P}$. Case (3) conflicts with
the assumption that $M$ is minimal. Hence no model of $S$ is a model of $S_{M_P}$.
Therefore, $S_{M_P}$ is unsatisfiable. □

Proof. (If part) Let $T$ be a proof tree of $S$ by model generation with
factorization. Suppose that $M$ is not minimal. Then there exists a model $M'$ of $S$ found
in $T$ such that $M' \subset M$. Let $P_M, P_{M'}$ be the paths corresponding to $M, M'$,
respectively. Then, there are sibling nodes $N$ and $N'$ in $T$ such that $N$ is on $P_M$
and $N'$ on $P_{M'}$. Let $L, L'$ be the labels of $N, N'$, respectively. In the case of $N \prec N'$,
$M'$ conflicts neither with $C_{M_P}$ because $L \notin M'$ nor with $\overline{M}$ because $M' \subset M$.
Thus, $M'$ is a model of $S_{M_P}$. This contradicts that $S_{M_P}$ is unsatisfiable. In the case
of $N \not\prec N'$, since a node labeled with $L'$ cannot appear on $P_M$, both $L' \notin M$
and $L' \in M'$ hold. This contradicts that $M' \subset M$. Therefore, $M$ is minimal. □



* If $M_P = \emptyset$, $L_1 \wedge \dots \wedge L_m \to$ becomes an empty clause $\to$, which denotes contradiction.
In this case, we conclude that $M$ is minimal without a groundedness test.
7 Experimental Results

This section compares experimental results on MM-MGTP with those on
MM-SATCHMO and MGTP. Regarding MM-MGTP, we also compare four versions:
model re-computing with/without branching lemmas (Rcmp+BL/Rcmp),
and model checking with/without branching lemmas (Mchk+BL/Mchk).
MM-MGTP and MGTP are implemented in Java, while MM-SATCHMO is implemented
in ECLiPSe Prolog. All experiments were performed on a Sun Ultra10 (333 MHz, 128 MB).
Table 1 shows the results. The examples used are as follows.



ex1. $S_n = \{\to a_k \vee b_k \vee c_k \vee d_k \vee e_k \vee f_k \vee g_k \vee h_k \vee i_k \vee j_k \mid 1 \le k \le n\}$

This problem is taken from the benchmark examples for MM-SATCHMO.
The MG-tree for ex1 is a balanced tree of branching factor 10, and every
generated model is minimal. Since every success branch contains no committed
disjunct, i.e., the corresponding model is safe, no minimality test is required if
branching lemmas are used.

ex2. $S_n = \{a_{i-1} \to a_i \vee b_i \vee c_i,\ b_i \to a_i,\ c_i \to b_i \mid 2 \le i \le n\} \cup \{\to a_1\}$

The MG-tree for ex2 becomes a right-heavy unbalanced tree. Only the
leftmost branch gives a minimal model, which subsumes all other models to the
right. With branching lemmas, these nonminimal models can be rejected.
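
The shape of the MG-tree for ex2 can be reproduced with a naive ground model generator. The sketch below is a hypothetical stand-in for plain model generation (no branching assumptions or lemmas), with clauses assumed to be encoded as pairs (antecedent atoms, consequent atoms). It uses the small size n = 6 so that it runs quickly.

```python
def success_branches(clauses, model=frozenset()):
    """Yield the model at the end of every success branch of naive
    ground model generation with case splitting."""
    for antecedent, consequent in clauses:
        if set(antecedent) <= model and not any(c in model for c in consequent):
            for atom in consequent:                  # one child per disjunct
                yield from success_branches(clauses, model | {atom})
            return
    yield model                                      # no violated clause left


def ex2(n):
    """Clause set S_n of ex2, as reconstructed in the text above."""
    clauses = [((), ("a1",))]
    for i in range(2, n + 1):
        clauses.append(((f"a{i-1}",), (f"a{i}", f"b{i}", f"c{i}")))
        clauses.append(((f"b{i}",), (f"a{i}",)))
        clauses.append(((f"c{i}",), (f"b{i}",)))
    return clauses


models = list(success_branches(ex2(6)))
minimal = [m for m in models if not any(other < m for other in models)]
print(len(models), len(minimal))     # 243 1
```

Each of the n - 1 disjunctive clauses triples the number of success branches, so n = 14 gives 3^13 = 1594323, which matches the model count reported for MGTP in Table 1, while only the leftmost model is minimal.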

ex3. $T_1 = \{\to a_1 \vee b_1,\ a_1 \to b_1,\ b_1 \to a_2 \vee b_2,\ a_2 \to b_2 \vee d_1\}$

$T_2 = \{b_2 \to a_3 \vee b_3,\ a_3 \to a_2 \vee c_2,\ a_3 \wedge a_2 \to b_3 \vee d_2,\ a_3 \wedge c_2 \to b_3 \vee d_2\}$

$T_j = \{b_j \to a_{j+1} \vee b_{j+1},\ a_{j+1} \to a_j \vee c_j,\ c_j \to a_{j-1} \vee c_{j-1},\ a_{j+1} \wedge a_2 \to b_{j+1} \vee d_j,\ a_{j+1} \wedge c_2 \to b_{j+1} \vee d_j\}$ ($j \ge 3$)

$S_n = \bigcup_{i=1}^{n} T_i$

The MG-tree for ex3 is a right-heavy unbalanced tree as for ex2. Since every
success branch contains committed disjuncts, minimality tests are inevitable.
However, none of the obtained models is rejected by the minimality test.

ex4. $S_a = \{\to a_i \vee b_i \vee c_i \vee d_i \vee e_i \mid 1 \le i \le 4\} \cup \{a_3 \to a_2,\ a_4 \to a_3,\ a_1 \to a_4\}$

ex5. $S_{abcd} = S_a \cup \{b_3 \to b_2,\ b_4 \to b_3,\ b_1 \to b_4\} \cup \{c_3 \to c_2,\ c_4 \to c_3,\ c_1 \to c_4\} \cup \{d_3 \to d_2,\ d_4 \to d_3,\ d_1 \to d_4\}$

ex4 and ex5 are taken from the paper [8]. No nonminimal models can be
rejected without using branching lemmas.
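
The minimal-model counts for ex4 and ex5 can be cross-checked by brute force. The sketch below is not MM-MGTP: it assumes the clause sets as reconstructed above, picks one disjunct from each positive clause, closes the choice under the implications, and keeps the inclusion-minimal results. The helper names are illustrative.

```python
from itertools import product


def closure(atoms, rules):
    """Close a set of atoms under rules of the form body_atom -> head_atom."""
    closed = set(atoms)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body in closed and head not in closed:
                closed.add(head)
                changed = True
    return frozenset(closed)


def minimal_models(disjunctions, rules):
    """Every minimal model is the closure of one choice per disjunction,
    so it suffices to filter those closures for inclusion-minimality."""
    candidates = {closure(choice, rules) for choice in product(*disjunctions)}
    return {m for m in candidates if not any(other < m for other in candidates)}


def chain(p):
    """The three implications p3 -> p2, p4 -> p3, p1 -> p4."""
    return [(f"{p}3", f"{p}2"), (f"{p}4", f"{p}3"), (f"{p}1", f"{p}4")]


disjunctions = [[f"{p}{i}" for p in "abcde"] for i in range(1, 5)]
ex4 = minimal_models(disjunctions, chain("a"))
ex5 = minimal_models(disjunctions, [r for p in "abcd" for r in chain(p)])
print(len(ex4), len(ex5))    # 341 17
```

The two counts agree with the number of models reported for all MM-MGTP versions and for MM-SATCHMO in Table 1.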



syn9-1. An example taken from the TPTP library [11], which is unsatisfiable.
Table 1. Performance comparison

Problem     Row     Rcmp+BL     Mchk+BL     Rcmp        Mchk        MM-SAT      MGTP
ex1 (N=5)   time    0.271       0.520       2.315       0.957       8869.950    0.199
            models  100000      100000      100000      100000      100000      100000
            failed  0           0           0           0           0           0
ex1 (N=7)   time    34.150      OM (>144)   324.178     OM (>115)   OM (>40523) 19.817
            models  10000000    -           10000000    -           -           10000000
            failed  0           -           0           -           -           0
ex2 (N=14)  time    0.001       0.001       82.112      16.403      1107.360    9.013
            models  1           1           1           1           1           1594323
            failed  26          26          1594322     1594322     1594323     0
ex3 (N=16)  time    19.816      5.076       19.550      5.106       OM (>2798)  589.651
            models  65536       65536       65536       65536       -           86093442
            failed  1           1           1           1           -           0
ex3 (N=18)  time    98.200      26.483      95.436      26.103      OM (>1629)  5596.270
            models  262144      262144      262144      262144      -           774840978
            failed  1           1           1           1           -           0
ex4         time    0.002       0.002       0.009       0.003       0.3         0.004
            models  341         341         341         341         341         501
            failed  96          96          160         160         284         0
ex5         time    0.001       0.001       0.002       0.001       0.25        0.001
            models  17          17          17          17          17          129
            failed  84          84          88          88          608         0
syn9-1      time    0.105       0.109       0.101       0.092       TO (>61200) 0.088
            models  0           0           0           0           -           0
            failed  19683       19683       19683       19683       -           19683
channel     time    4.016       4.064       46.166      4.517       NA          3.702
            models  51922       51922       51922       51922       -           51922
            failed  78          78          78          78          -           78

time: time (sec), models: No. of models, failed: No. of failed branches.
MM-SAT: MM-SATCHMO, OM: Out of memory, TO: Time out.
NA: Not available due to lack of constraint handling.




Channel. A channel-routing problem [12] in which constraint propagation with
negative literals plays an essential role in pruning the search space. One can obtain
only minimal models with MGTP. The last two first-order examples are used to
estimate the overhead of minimality testing in MM-MGTP.

MM-MGTP vs. MM-SATCHMO. Since MM-SATCHMO aborted execution
very often due to memory overflow, we consider the problems that
MM-SATCHMO could solve. A great advantage of MM-MGTP is seen for ex1, which
does not need any minimality test, and ex2, in which branching lemmas have high
pruning effects. The fastest version of MM-MGTP achieves a speedup of 33,000
and 1,100,000 for ex1 and ex2, respectively, compared to MM-SATCHMO. Even
for small problems like ex4 and ex5, MM-MGTP is more than one hundred
times faster than MM-SATCHMO. In addition, Niemelä's system is reported to
take less than 2 and 0.5 seconds for ex4 and ex5, respectively [8].
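
The quoted speedups follow directly from the times in Table 1. The short calculation below simply divides the MM-SATCHMO time by the fastest MM-MGTP time for each problem; the dictionary layout is illustrative, and the numbers are copied from the table.

```python
# Times in seconds from Table 1: (fastest MM-MGTP version, MM-SATCHMO).
times = {
    "ex1 (N=5)": (0.271, 8869.950),
    "ex2 (N=14)": (0.001, 1107.360),
    "ex4": (0.002, 0.3),
    "ex5": (0.001, 0.25),
}

speedup = {name: satchmo / mgtp for name, (mgtp, satchmo) in times.items()}
for name, ratio in speedup.items():
    print(f"{name}: about {ratio:,.0f}x")
```

The ratios come out at roughly 32,700 for ex1, 1.1 million for ex2, 150 for ex4 and 250 for ex5, consistent with the rounded figures in the text.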

MM-MGTP vs. MGTP. Compared to MGTP, the proving time for MM-MGTP
becomes much shorter as the number of nonminimal models rejected by branching
lemmas and minimality tests increases. In particular, ex2 and ex3 exhibit a great
effect of minimal model generation with branching lemmas. Although ex1, syn9-1,
and channel are problems in which no nonminimal model is created, very little
overhead is observed for MM-MGTP with branching lemmas, because
minimality tests can be omitted.



Rcmp vs. Mchk. Proving time for Rcmp is 2 to 5 times that
for Mchk because of re-computation overhead, for the propositional problems ex1
(except N=7) through ex5, which do not require a term memory [10]. For the
first-order problem channel, which requires the term memory, Rcmp is about 10
times slower than Mchk. This is because the overhead of term memory access is
large and, in addition, Rcmp doubles the access frequency when performing
groundedness tests.

Next, look at the branching lemma effect. For ex1 and channel, since
minimality tests can be omitted with branching lemmas, Rcmp+BL and Mchk+BL
obtain 8.5- to 11.5-fold and 1.84- to 1.1-fold speedups, respectively, compared to
versions without branching lemmas. Although the speedup ratio is rather small for
Mchk+BL, this shows that Mchk based on model tree traversal is very efficient.
ex2 is a typical example that demonstrates the effect: both Rcmp+BL and
Mchk+BL achieve speedups of several tens of thousands, as expected.



Rcmp+BL vs. Mchk+BL. For ex3, in which minimality tests cannot be omitted,
Mchk+BL is about 4 times faster than Rcmp+BL. Although for ex1 (N=5)
no difference between Mchk+BL and Rcmp+BL should exist in principle, the
former is about 2 times slower than the latter. This is because Mchk+BL has to
retain all generated models, thereby causing frequent garbage collection.

8 Related Work 

In a tableaux framework, Letz presented factorization [6] to prune tableaux
trees. Complement splitting (or folding-down in [6]) is a restricted way of
implementing factorization. It is restricted in the sense that a precedence relation is
pre-determined between disjuncts in each disjunction, and that only a disjunct
having higher precedence can commit its proof to that of another sibling disjunct
with lower precedence, whereas such precedence is not pre-determined in
factorization. Although factorization is more powerful than complement splitting, it
may also generate nonminimal models without any guide or control.

Lu [7] proposed a minimal model generation procedure which in a sense
relaxes the above restriction by adding branching assumptions symmetrically, as
in $(a \wedge [\neg b]) \vee (b \wedge [\neg a])$ for a disjunction $a \vee b$. However, his method involves
post-determination of the precedence between the disjuncts. This is because mutual
proof commitment may occur due to symmetrical branching assumptions, and
some possibly open branches are forced to be closed, thereby making the proof
unsound (and incomplete w.r.t. model finding). If this is the case, some tentatively
closed branches have to be re-opened, so that the performance would degrade.

Branching lemmas proposed in this paper can still be taken as a restricted
implementation of factorization, because a disjunct is prevented from generating
a branching lemma once a branching assumption of some sibling disjunct is
used to prove the disjunct, whether mutual proof commitment actually occurs or
not. Nevertheless, our method provides an efficient way of applying factorization
to minimal model generation, since it is unnecessary to compute the transitive
closure of the factorization relation. The effects of the branching lemma
mechanism are summarized as follows: it can (1) suppress the generation of nonminimal
models to a great extent, (2) avoid unnecessary minimality tests, and (3) restrict
the range of minimality tests on the current model $M$ to models on which
committing disjuncts relevant to $M$ appear.

The model checking version of MM-MGTP aims to improve MM-SATCHMO 
by introducing branching lemmas, and it is also based on complement splitting 
and constrained searches. The major differences between the two systems are as
follows. MM-SATCHMO stores model constraints as negative clauses and performs
minimality tests through conjunctive matching on the negative clauses, thereby 
being very inefficient in terms of space and time. Our model checking version, 
on the other hand, is more efficient because model constraints are retained in a 
model tree in which multiple models can share common paths, and minimality 
tests are suppressed or restricted by using branching lemmas. 

Since the above two systems depend on model constraints, which are a kind
of memoization, they may consume much memory space, the size of which might
increase exponentially in the worst case. This situation is alleviated by Niemelä's
method [8]. It can reject every nonminimal model without performing a
minimality test against previously found minimal models, by means of the cut rule,
which is essentially equivalent to complement splitting, and the groundedness test,
which is an alternative to the constrained search.

The model re-computing version of MM-MGTP takes advantage of Niemelä's
method, in which it is unnecessary to retain model constraints. However, both
systems repeatedly perform groundedness tests, which are rather more expensive than
constrained searches. In addition, they necessarily generate each minimal model
twice. In the model re-computing version, the latter problem is remedied to some
extent by introducing shortened negative clauses. Moreover, due to branching
lemmas, it is possible to invoke as few groundedness tests as possible.

9 Conclusion 

We have presented an efficient method to construct minimal models by means
of branching assumptions and lemmas. Our work was motivated by two
approaches: Bry's method based on complement splitting and constrained searches
and Niemelä's method that employs the groundedness test. However, both
methods may contain redundant computation, which can be suppressed by using
branching lemmas in MM-MGTP. The experimental results with MM-MGTP
show that orders of magnitude speedup can be achieved for some problems.

Nevertheless, we still need minimality tests when branching lemmas are not 
applicable. It should be pursued in future work to omit as many minimality 
tests as possible, for instance, through a static analysis of clauses. It would also 
be worthwhile to combine our method with other pruning techniques such as 
folding-up and full factorization, or to apply it to stable model generation. 
Abstract. FDPLL is a directly lifted version of the well-known
Davis-Putnam-Logeman-Loveland (DPLL) procedure. While DPLL is based on
a splitting rule for case analysis wrt. ground and complementary literals,
FDPLL uses a lifted splitting rule, i.e. the case analysis is made wrt.
non-ground and complementary literals now.

The motivation for this lifting is to bring together successful first-order
techniques like unification and subsumption to the propositionally
successful DPLL procedure.

At the heart of the method is a new technique to represent first-order
interpretations, where a literal specifies truth values for all its ground
instances, unless there is a more specific literal specifying opposite truth
values. Based on this idea, the FDPLL calculus is developed and proven
as strongly complete.



1 Introduction 

The¹ well-known Davis-Putnam procedure, as it is usually called, was brought
forward in the early 60s by the researchers mentioned in the title [DP60, DLL62,
D63]. Nowadays, the procedure is most successfully applied to decide
propositional problems, although it was originally conceived as a method for first-order
theorem proving. To this end, successively increased sets of ground instances
of first-order clauses are enumerated and fed into the propositional part of the
procedure. This latter part is referred to as “propositional DPLL” in the sequel.

With the advent of the resolution calculus, the lifting of inference rules to the 
first-order level is standard in virtually all calculi and efficient proof procedures 
for first-order logic — except for Davis-Putnam-Logeman-Loveland methods. 
Thus, the purpose of this paper is to present a lifted version that fills this gap. 

On an abstract level, the advantage of the “lifted” methods compared to 
the “propositional” methods stems from two sources: first, it is possible with a 
lifted method to finitely represent infinitely many inferences of the corresponding 
propositional methods, and, second, much more powerful redundancy elimina- 
tion techniques are possible, e.g. based on subsumption. The motivation is to 
bring these advantages to DPLL. The other way round, FDPLL instantiates to 
propositional DPLL when applied to propositional logic. 

¹ For a long version of the paper see
http://www.uni-koblenz.de/fb4/publikationen/gelbereihe/.

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 200-219, 2000. 

Springer-Verlag Berlin Heidelberg 2000
Brief Description of FDPLL. In order to describe the main idea of FDPLL,
it is helpful to refer to the widely-used presentation of propositional DPLL as
a calculus with a single splitting rule, which carries out a case analysis wrt.
a propositional variable $A$. More exactly, the current clause set $S$ splits into
two cases: the one where $A$ is “true”, and the other where $A$ is “false”, which
give rise to simplifications based on the new information. DPLL proceeds by
considering different cases until success (some case is a model for $S$) or failure
(each considered case contradicts a clause in $S$).

The idea in FDPLL is to lift this splitting to the first-order level, i.e. to split
with complementary non-ground literals like $P(x,y)$ and $\neg P(x,y)$. The difficulty
here is that the “usual” way of reading the literals as universally quantified (i.e.
$\forall x,y\ P(x,y)$ and $\forall x,y\ \neg P(x,y)$) immediately leads to an unsound calculus.
Hence, a different reading is adopted: a bit simplified, a literal, say $P(x,y)$,
stands by default for all its ground instances, say, $P(a,a)$, $P(a,b)$, $P(b,a)$ and
$P(b,b)$ (suppose here that only the constants $a$ and $b$ are present). However, the
presence of a strictly more specific literal (wrt. the instantiation order) than
$P(x,y)$ with complementary sign, say $\neg P(x,b)$, gives rise to exceptions of the
default reading of $P(x,y)$ by excluding all instances of $P(x,b)$. Symmetrically,
$\neg P(x,b)$ stands by default for all its ground instances (with the possibility to
have exceptions again). So, the two literals $P(x,y)$ and $\neg P(x,b)$ together stand
for $P(a,a)$, $\neg P(a,b)$, $P(b,a)$ and $\neg P(b,b)$, which in turn can be understood as an
interpretation $I$ in the obvious way.
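
The default-with-exceptions reading can be made concrete with a small sketch. It is a simplification under stated assumptions, not the formal definition of the calculus: atoms are function-free, variables are written with a leading `?` (a convention chosen here), and a literal assigns its sign to a ground atom unless a strictly more specific literal of opposite sign also covers that atom.

```python
from itertools import product


def is_var(t):
    return t.startswith("?")


def instance_of(specific, general):
    """True if the atom `specific` is an instance of the atom `general`.
    Atoms are (predicate, argument-tuple) pairs without function symbols."""
    (p, s_args), (q, g_args) = specific, general
    if p != q or len(s_args) != len(g_args):
        return False
    sigma = {}
    for g, s in zip(g_args, s_args):
        if is_var(g):
            if sigma.setdefault(g, s) != s:   # same variable, different value
                return False
        elif g != s:
            return False
    return True


def induced_interpretation(case, constants):
    """case: list of (sign, atom). Returns the set of signed ground atoms."""
    result = set()
    signatures = {(atom[0], len(atom[1])) for _, atom in case}
    for pred, arity in signatures:
        for args in product(constants, repeat=arity):
            ground = (pred, args)
            for sign, atom in case:
                if not instance_of(ground, atom):
                    continue
                # Exception: an opposite-sign literal strictly more specific
                # than `atom` that also covers the ground atom.
                overridden = any(
                    s2 != sign
                    and instance_of(ground, a2)
                    and instance_of(a2, atom)
                    and not instance_of(atom, a2)
                    for s2, a2 in case
                )
                if not overridden:
                    result.add((sign, ground))
    return result


case = [(True, ("P", ("?x", "?y"))), (False, ("P", ("?x", "b")))]
I = induced_interpretation(case, ["a", "b"])
for sign, (pred, args) in sorted(I, key=lambda lit: lit[1]):
    print(("" if sign else "not ") + f"{pred}({', '.join(args)})")
```

For the two literals of the running example, the sketch yields P(a, a), not P(a, b), P(b, a) and not P(b, b), the interpretation given in the text.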

Now, a “case” in FDPLL is just a set of possibly non-ground literals, such that
an interpretation can be associated to it, as just sketched. Based on this idea, the
purpose of the splitting rule of FDPLL can be explained as follows: suppose there
is an instance $C\sigma$ of a clause $C$ that is “false” in the interpretation $I$ associated
to the current case ($\sigma$ is computed by most general unification). Then, a split is
attempted with a literal $L \in C\sigma$ in order to “repair” $I$ towards an interpretation
that assigns “true” to $L$, and hence to $C\sigma$ as well². If this is not possible because
of some elementary contradiction between $C\sigma$ and the current case, the current
case is refuted (“closed”). Otherwise, two new cases come up, the one extending
the current case with $L$, and the other with $\overline{L}$.

Continuing the example above, suppose the current case is $\{P(x,y),
\neg P(x,b)\}$, hence $I = \{P(a,a), \neg P(a,b), P(b,a), \neg P(b,b)\}$, and suppose that
there is a clause $C = P(x,y) \vee \neg P(x,a)$. The clause instance $C\sigma = P(x,b) \vee
\neg P(x,a)$ is “false” in $I$, where $\sigma = \{y/b\}$ is computed by most general
unification of the literals of $C$ and complements of literals of $I$. Regarding the two
literals $P(x,b)$ and $\neg P(x,a)$ in $C\sigma$, only $\neg P(x,a)$ is a candidate for splitting,
because $P(x,b)$ is an elementary contradiction to $\neg P(x,b)$ of the current case.

The procedure repeatedly carries out splits in this way and stops if every
case is refuted (and reports “unsatisfiable”), or if no clause instance $C\sigma$ of the
mentioned kind exists (and reports the current case as a model representation)³.

² Actually, this is a bit simplified, but it serves well to illustrate the idea.

³ There is a second variant of the splitting rule called “Commit” with the purpose to
achieve that $I$ is indeed consistent.
An improvement of this “basic” procedure recovers for certain branch literals
the above mentioned universally quantified reading (cf. Section 5). It lifts to
the first-order level the well-known propositional DPLL rule for propagating
unit clauses. In resolution terminology, the improvement realizes unit-resulting
resolution, and the well-known possibility to split clauses on the basis of
variable-disjoint subclauses (as in $P(x) \vee Q(y)$).

Properties of FDPLL. Propositional DPLL has certain desirable features: its
conceptual simplicity, space efficiency (“one branch at a time”), few inference
rules (one is sufficient), efficient and adaptable implementations (the most
efficient systematic propositional methods are based on DPLL, e.g. NTAB [CA96]
and SATO [Zha97]), existence of non-clausal versions [BBOS98], and the
possibility to immediately extract a model in case that no refutation exists. A goal
of this work is to keep these features for the lifted version FDPLL.

FDPLL is in particular space efficient, proof confluent and convergent (i.e. a
strong completeness theorem holds). While these properties go without saying
for propositional DPLL, they are an issue for certain first-order methods, e.g.
tableau and connection calculi (but see [BEF99] for a proof confluent strongly
complete connection calculus). Beyond this, FDPLL is known to be a decision
procedure for the Bernays-Schönfinkel class, i.e. clause logic with no function
symbols other than constants. This is a non-trivial class, in the sense that most
resolution and tableau systems cannot decide it, except in a trivial way by using the
finite set of ground clauses.

Structure of the Paper. After stating some preliminaries, the model
representation technique is introduced. Based on it, the calculus is developed. Then
soundness and completeness are treated, followed by a sketch of the
mentioned “universal literal” improvement. The subsequent proof procedure proves
the existence of a concrete, fair strategy. Finally, some conclusions are drawn,
including related work.

2 Preliminaries 

The usual notions of first-order logic are applied in a way consistent with [CL73]. A
literal is an atom or a negated atom. The letters $K$ and $L$ are reserved to denote
literals. The complement of a literal $L$ is $\overline{L} = A$, if $L = \neg A$ for some atom $A$, or
else $\overline{L} = \neg L$; by $|L|$ the atom of $L$ is denoted, i.e. $|A| = A$ and $|\neg A| = A$ for any
atom $A$. A clause is a finite, possibly empty multiset $\{L_1, \dots, L_n\}$ of literals,
usually written as a disjunction $L_1 \vee \dots \vee L_n$. By a clause set always a finite set
of clauses is meant. The letters $C$ and $D$ are reserved to denote clauses.

An interpretation $I$ for a given signature $\Sigma$ is a set of ground $\Sigma$-literals such
that either $A \in I$ or $\neg A \in I$ for every ground $\Sigma$-atom $A$. The signature $\Sigma$
is always given implicitly by the input clause set under consideration, and the
prefix “$\Sigma$-” usually is not written. All the results below hold wrt. such Herbrand
interpretations; it only has to be assumed that $\Sigma$ contains at least one constant
symbol (if none is there, some constant $a$ is added artificially).
A ground literal $L$ and a ground clause $C$ are evaluated wrt. an interpretation
$I$ as expected, i.e. $I(L) = true$ iff $L \in I$ and $I(C) = true$ iff $I(L) = true$
for some $L \in C$. Furthermore, as expected, for a non-ground clause $C$ define
$I(C) = true$ iff $I(C') = true$ for every ground instance $C'$ of $C$. As usual,
$I \models X$ means $I(X) = true$ where $X$ is a literal, clause or clause set (interpreted
conjunctively).

A unifier for a set $Q$ of terms (or literals) is a substitution $\delta$ such that $Q\delta$ is
a singleton. The notion of most general unifier (MGU) is used in the usual sense
(e.g. [CL73]), and a respective unification algorithm unify is assumed as given.
The notation $\sigma = unify(Q)$ means that an MGU $\sigma$ of $Q$ exists and is computed
by unify applied to $Q$.

Quite frequently, a simultaneous unifier for a set $\{Q_1, \dots, Q_n\}$ of
unification problems is to be computed, which is a substitution $\delta$ that is a unifier for
every $Q_1, \dots, Q_n$. The notion of a most general unifier can be defined in the
standard way in the simultaneous case as well. Further, a simultaneous most
general unifier (simply called MGU as well) can be computed by iterative
application of unify to $Q_1, \dots, Q_n$. See [Ede85] for a thorough treatment. Thus,
we may suppose as given a simultaneous unification algorithm s-unify and write
$\sigma = \textit{s-unify}(\{Q_1, \dots, Q_n\})$ in analogy to $\sigma = unify(Q)$ above.
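
The iterative construction of a simultaneous MGU can be sketched as follows. This is a textbook-style unification routine written for illustration, not the algorithm of [Ede85]: terms and literals are assumed to be nested tuples `(symbol, arg1, ...)`, constants are plain strings, variables are strings with a leading `?`, and substitutions are kept in triangular form and resolved by the hypothetical helper `apply`.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def walk(t, s):
    """Follow variable bindings in the substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])


def unify_pair(t1, t2, s):
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if is_var(t1):
        return None if occurs(t1, t2, s) else {**s, t1: t2}
    if is_var(t2):
        return unify_pair(t2, t1, s)
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        for a, b in zip(t1[1:], t2[1:]):
            s = unify_pair(a, b, s)
            if s is None:
                return None
        return s
    return None


def unify(Q, s=None):
    """Extend s to a unifier making all terms of Q equal, or return None."""
    s = {} if s is None else s
    terms = list(Q)
    for t in terms[1:]:
        s = unify_pair(terms[0], t, s)
        if s is None:
            return None
    return s


def s_unify(problems):
    """Simultaneous unification by iterating unify over Q1, ..., Qn."""
    s = {}
    for Q in problems:
        s = unify(Q, s)
        if s is None:
            return None
    return s


def apply(t, s):
    """Fully apply the substitution s to the term t."""
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, s) for a in t[1:])
    return t


sigma = s_unify([
    [("P", "?x", "?y"), ("P", "a", "?y")],
    [("Q", "?y"), ("Q", ("f", "?x"))],
])
print(apply(("P", "?x", "?y"), sigma))    # ('P', 'a', ('f', 'a'))
print(unify(["?x", ("f", "?x")]))         # None (occurs check)
```

Threading one substitution through all problems is what makes the result a unifier for every $Q_i$ at once.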

For literals $K$ and $L$ define $K \gtrsim L$, $K$ is more general than $L$, iff
there is a substitution $\delta$ such that $K\delta = L$; $K$ and $L$ are
variants, written as $K \sim L$, iff $K \gtrsim L$ and $L \gtrsim K$; $K$ is
strictly more general than $L$, $K \gnsim L$, iff $K \gtrsim L$ and not
$K \sim L$. $L$ is also said to be a strict, or proper, instance of $K$ then. If
neither $K \gtrsim L$ nor $L \gtrsim K$ then $K$ and $L$ are incomparable.
Finally, define $L \in_\sim N$ iff $L \sim K$ for some $K \in N$, where $N$ is a
set of literals.
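
The generality ordering reduces to one-sided matching, which the following
Python sketch illustrates. The encoding is an assumption made here: terms and
literals are nested tuples `(functor, arg, ...)`, variables are strings, and a
negative literal is wrapped in the functor `'not'`. All function names are
hypothetical.

```python
def is_var(t):
    # Assumption: variables are strings, everything else is a tuple.
    return isinstance(t, str)

def match(general, specific, subst=None):
    """Return a substitution d with general*d == specific, or None.

    Variables of `specific` are treated as constants, so the two literals
    may share variable names.
    """
    subst = dict(subst or {})
    if is_var(general):
        if general in subst:
            return subst if subst[general] == specific else None
        subst[general] = specific
        return subst
    if is_var(specific) or general[0] != specific[0] or len(general) != len(specific):
        return None
    for g, s in zip(general[1:], specific[1:]):
        subst = match(g, s, subst)
        if subst is None:
            return None
    return subst

def more_general(k, l):
    return match(k, l) is not None

def variants(k, l):
    return more_general(k, l) and more_general(l, k)

def strictly_more_general(k, l):
    return more_general(k, l) and not more_general(l, k)

def incomparable(k, l):
    return not more_general(k, l) and not more_general(l, k)

def in_variants(l, n):
    """L is a member of N up to variants."""
    return any(variants(l, k) for k in n)

a, b, c = ('a',), ('b',), ('c',)
```

Because matching never instantiates the more specific literal, no renaming
apart is needed before comparing two literals.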

3 Basic Concepts Related to Literal Sets 

As mentioned, interpretations shall be represented by literal sets. This section 
contains the respective definitions. In the sequel N always denotes a possibly 
infinite literal set. 

Definition 1 (Most Specific Generalization). A literal $K$ is called a most
specific generalization (MSG) of a literal $L$ wrt. $N$ iff $K \gtrsim L$ and
there is no $K' \in N$ such that $K \gnsim K' \gtrsim L$.

Notice that nothing is said whether $K, L \in N$ or not.

Example 1. Consider^4 $N_1 = \{P(a,y,u), P(x,b,u)\}$. Then both $P(a,y,u)$ and
$P(x,b,u)$ are MSGs of $P(a,b,c)$ wrt. $N_1$. This shows that MSGs need not be
unique. The literal $P(x,y)$ is not an MSG of $P(y,f(x))$ wrt.
$\{P(x,f(y))\}$, because $P(x,y) \gnsim P(x,f(y)) \gtrsim P(y,f(x))$.

An MSG $K \in N$ of $L$ wrt. $N$ is a "potential reason" for $L$ to be true in
the interpretation associated to $N$, because $K \gtrsim L$ (as said in the
introduction). For efficiency reasons in FDPLL it is desirable to have as few
such "reasons" as possible. Therefore, most specific generalizations are used.

^4 Here and below, the letters $P, Q, R, \ldots$ denote predicate symbols,
$a, b, c, \ldots$ denote constants, $f, g, h, \ldots$ denote non-constant
function symbols, and $x, y, z, \ldots$ denote variables.

Definition 2 (Productivity). A literal $K$ produces $L$ wrt. $N$ iff $K$ is an
MSG of $L$ wrt. $N$ and there is no $K' \in N$ such that
$K \gnsim \overline{K'} \gtrsim L$. For a clause $C$, $K$ produces $C$ wrt. $N$
iff $K$ produces some literal $L \in C$ wrt. $N$. The set $N$ produces $L$
(resp. $C$) iff some $K \in N$ produces $L$ (resp. $C$) wrt. $N$.

Referring again to the introduction and above, this definition realizes the
possibility to prevent an MSG $K$ of $L$ wrt. $N$ from assigning true to $L$ if
there is a complementary literal in between (wrt. $\gtrsim$) $K$ and $L$, as
stated. An equivalent, more compact definition of "$K$ produces $L$ wrt. $N$" is
that $K \gtrsim L$ and there is no literal $K' \in N$ such that
$|K| \gnsim |K'| \gtrsim |L|$.

Example 2. Let $N_2 = \{P(a,y,u), \neg P(x,b,u), P(a,b,u)\}$. Then, $P(a,b,u)$
produces the literal $P(a,b,f(u))$ wrt. $N_2$. However, $P(a,y,u)$ does not
produce $P(a,b,f(u))$ wrt. $N_2$ because $P(a,y,u)$ is not an MSG of
$P(a,b,f(u))$ wrt. $N_2$ (since $P(a,b,u) \lnsim P(a,y,u)$ is an MSG of
$P(a,b,f(u))$ wrt. $N_2$). The literal $\neg P(x,b,u)$ produces
$\neg P(b,b,f(u))$ wrt. $N_2$ but does not produce $\neg P(a,b,f(u))$ (since
$\neg P(x,b,u) \gnsim \overline{K'} \gtrsim \neg P(a,b,f(u))$, where
$K' = P(a,b,u) \in N_2$).
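
Definitions 1 and 2 can be checked mechanically for finite literal sets. The
following Python sketch replays Example 2 under the same assumed encoding as
before (nested tuples for terms, strings for variables, the functor `'not'` for
negation); the function names are hypothetical and the code is an illustration,
not part of FDPLL itself.

```python
def is_var(t):
    return isinstance(t, str)

def match(general, specific, subst=None):
    """One-sided matching: a substitution d with general*d == specific, or None."""
    subst = dict(subst or {})
    if is_var(general):
        if general in subst:
            return subst if subst[general] == specific else None
        subst[general] = specific
        return subst
    if is_var(specific) or general[0] != specific[0] or len(general) != len(specific):
        return None
    for g, s in zip(general[1:], specific[1:]):
        subst = match(g, s, subst)
        if subst is None:
            return None
    return subst

def more_general(k, l):
    return match(k, l) is not None

def strictly_more_general(k, l):
    return more_general(k, l) and not more_general(l, k)

def complement(lit):
    """Flip the sign of a literal."""
    return lit[1] if lit[0] == 'not' else ('not', lit)

def is_msg(k, l, n):
    """Definition 1: no K' in N lies strictly below K and above L."""
    return more_general(k, l) and not any(
        strictly_more_general(k, k2) and more_general(k2, l) for k2 in n)

def produces(k, l, n):
    """Definition 2: K is an MSG of L and no complement of a literal in N intervenes."""
    return is_msg(k, l, n) and not any(
        strictly_more_general(k, complement(k2)) and more_general(complement(k2), l)
        for k2 in n)

def set_produces(n, l):
    return any(produces(k, l, n) for k in n)

a, b, c = ('a',), ('b',), ('c',)
N2 = [('P', a, 'y', 'u'), ('not', ('P', 'x', b, 'u')), ('P', a, b, 'u')]
```

For instance, `produces(('P', a, b, 'u'), ('P', a, b, ('f', 'u')), N2)` tests
the first claim of Example 2.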

Definition 3 (Ground Expansion). Define the ground expansion of $N$ as
$[\![N]\!] = \{L \mid L \text{ is a ground literal and } N \text{ produces } L\}$.

The plan is to identify for a literal set $N$ constructed by FDPLL its ground
expansion $[\![N]\!]$ with an interpretation $I$. Recall from Section 2 that an
interpretation is a set of ground literals such that either $A \in I$ or
$\neg A \in I$ for every ground atom $A$. However, there is in general no reason
for $[\![N]\!]$ to be an interpretation. For instance:

Example 3. The set $N_3 = \{P(a,y,u), \neg P(x,b,u)\}$ produces both $P(a,b,c)$
and $\neg P(a,b,c)$. Hence, $[\![N_3]\!]$ is not an interpretation.

Note 1 (Completeness of $N$). Beyond this inconsistency problem, a completeness
problem arises as well: for instance, $N = \{\}$ does not produce a single
literal. The completeness problem can be solved by adding to $N$ an expression
$\neg x$, where $x$ is a variable; the "literal" $\neg x \in N$ then acts as a
default case to assign false to positive literals. Thus, for instance,
$\{\neg x, P(a)\}$ produces every negative literal except $\neg P(a)$ and thus
assigns false to every positive literal, except $P(a)$.^5

The following definition formalizes these concepts.

Definition 4 (Contradictory, Consistent, Complete). A literal set $N$ is called
contradictory iff there are literals $L, K \in N$ such that
$L \sim \overline{K}$. The term "non-contradictory" means "not contradictory".
$N$ is called consistent wrt. a literal $L$ iff $N$ does not produce both $L$
and $\overline{L}$; $N$ is called consistent iff $N$ is consistent wrt. every
literal $L$. The term inconsistent means "not consistent". $N$ is called
complete iff for every literal $L$, $N$ produces $L$ or $\overline{L}$. The term
incomplete means "not complete".

^5 Of course, instead of "$\neg x$", "$x$" could be taken as well, which would
emphasize the use of negative clauses ("goals") in the calculus.

Example 4. For instance, $N_3$ from Example 3 is non-contradictory and
inconsistent wrt. $P(a,b,u)$ (and hence wrt. $\neg P(a,b,u)$ as well). Adding
either $P(a,b,u)$ or $\neg P(a,b,u)$ renders the set consistent (and hence
non-contradictory, as is easily seen), and adding both renders the set
contradictory and inconsistent wrt. $P(a,y,u)$ again. Each of these sets is
incomplete, and adding $\neg x$ achieves completeness.

With these definitions, the intuition so far can be made precise:

Proposition 1 (Interpretation). If $\neg x \in N$ then $N$ is complete. If $N$
is consistent and complete, then $[\![N]\!]$ is an interpretation.

It can be noted that "productivity" and "modelship" are not related on the
non-ground level. For instance, take
$N = \{\neg x, \neg P(a), P(b), Q(a), \neg Q(b)\}$ and $C = P(x) \lor Q(x)$.
Then $[\![N]\!] \models C$ but there is no $L \in N$ such that $L$ produces $C$
wrt. $N$. Conversely, $N = \{\neg x, P(a)\}$ produces $\neg P(x)$ but
$[\![N]\!] \not\models \neg P(x)$.

As mentioned in the introduction, model candidates shall be given up when being
"elementary contradictory" to an (instance of an) input clause. The following
definition makes this precise (preliminarily):

Definition 5 (Closed, Open). A literal set $N$ is closed by a clause $C$ and
substitution $\delta$ iff $\overline{L} \in_\sim N$ for every $L \in C\delta$;
$N$ is closed by $C$ iff $N$ is closed by $C$ and some substitution $\delta$.
Finally, $N$ is closed by a clause set $S$ iff $N$ is closed by some clause
$C \in S$. The term "open" means "not closed", and "$N$ is open wrt. $S$" means
"$N$ is not closed by $S$".

For instance, $N = \{P(x,y), Q(a)\}$ is closed by
$C = \neg P(x,y) \lor \neg P(y,x) \lor \neg Q(z)$ and $\delta = \{z/a\}$,
because for every literal in $C\delta$ there is a complementary variant in $N$.
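
For a given substitution, the closedness test of Definition 5 is a membership
check up to variants. The Python sketch below replays the instance just
discussed; the tuple encoding of literals, the dict encoding of substitutions
and all function names are assumptions made for this illustration. Finding a
suitable $\delta$ is not attempted here.

```python
def is_var(t):
    return isinstance(t, str)

def match(general, specific, subst=None):
    """One-sided matching: a substitution d with general*d == specific, or None."""
    subst = dict(subst or {})
    if is_var(general):
        if general in subst:
            return subst if subst[general] == specific else None
        subst[general] = specific
        return subst
    if is_var(specific) or general[0] != specific[0] or len(general) != len(specific):
        return None
    for g, s in zip(general[1:], specific[1:]):
        subst = match(g, s, subst)
        if subst is None:
            return None
    return subst

def variants(k, l):
    return match(k, l) is not None and match(l, k) is not None

def complement(lit):
    return lit[1] if lit[0] == 'not' else ('not', lit)

def substitute(t, delta):
    """Apply a substitution given as a dict from variables to terms."""
    if is_var(t):
        return delta.get(t, t)
    return (t[0],) + tuple(substitute(arg, delta) for arg in t[1:])

def closed_by(n, clause, delta):
    """N is closed by the clause and delta (Definition 5, first part)."""
    return all(
        any(variants(complement(substitute(lit, delta)), k) for k in n)
        for lit in clause)

a = ('a',)
N = [('P', 'x', 'y'), ('Q', a)]
C = [('not', ('P', 'x', 'y')), ('not', ('P', 'y', 'x')), ('not', ('Q', 'z'))]
```

With `delta = {'z': a}` the check succeeds, whereas the empty substitution
fails on the literal $\neg Q(z)$.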

Again, as mentioned in the introduction, the substitutions used in FDPLL shall
be computed as most general substitutions. This holds in particular for the
substitutions $\delta$ that allow to close a literal set. This motivates the
following definition.

Definition 6 (Branch Unifier). Let $C = L_1 \lor \cdots \lor L_n$ be a clause. A
substitution $\sigma$ is called a branch unifier of $C$ against $N$ iff there
are pairwise variable disjoint literals $K_1, \ldots, K_n \in_\sim N$, each
variable disjoint from $C$, and such that the following holds:

(i) $\sigma = \mathit{s\text{-}unify}(\{\{K_1, \overline{L_1}\}, \ldots, \{K_n, \overline{L_n}\}\})$, and

(ii) $K_i$ produces $\overline{L_i}\sigma$ wrt. $N$, for $1 \leq i \leq n$.

If $N$ is closed by $C$ and $\sigma$, then $\sigma$ is called a closing branch
unifier, otherwise $\sigma$ is called a falsifying branch unifier.

Item (i) realizes the mentioned computation at the most general level. Item (ii)
guarantees that a clause instance $C\sigma$ is identified, such that (at least
one ground instance of) $C\sigma$ is false in $[\![N]\!]$ (cf. Lemma 1 below).
This is a useful restriction, as a clause instance $C\sigma$ every (ground
instance) of which is true in $[\![N]\!]$ need not be considered at all.

Example 5. Let $N_4 = \{P(a,y,u), \neg P(x,b,u), P(a,b,u)\}$ and
$C = \neg P(a,c,z) \lor P(z,v,z)$. Take $K_1 = P(a,y,u)$ and
$K_2 = \neg P(x,b,u') \in_\sim N_4$. Observe that
$\sigma = \{u/z, u'/z, v/b, x/z, y/c\}$ is a branch unifier of $C$ against $N_4$
since $\sigma$ is a simultaneous MGU for
$\{\{P(a,c,z), P(a,y,u)\}, \{\neg P(z,v,z), \neg P(x,b,u')\}\}$, and
$K_1 = P(a,y,u)$ produces $P(a,c,z)\sigma = P(a,c,z)$ wrt. $N_4$, and
furthermore $K_2 = \neg P(x,b,u')$ produces
$\neg P(z,v,z)\sigma = \neg P(z,b,z)$. Further observe that $\sigma$ is a
falsifying branch unifier (i.e. $N_4$ is not closed by $C$ and $\sigma$).

As a negative example, there is no branch unifier of $P(a,b,c)$ against $N_4$,
because although item (i) in Def. 6 is satisfied by taking
$K_1 = \neg P(x,b,u)$ and $\sigma = \{x/a, u/c\}$, item (ii) is violated,
because $\neg P(x,b,u)$ does not produce $\neg P(a,b,c)$ wrt. $N_4$.

Branch unifiers are a purely syntactical concept, and existence of branch
unifiers for finite literal sets $N$ obviously is decidable. The following lemma
then, read in the contrapositive direction, guarantees that $[\![N]\!]$ is a
model for the given clause if no branch unifier exists (provided that $N$ is
consistent).

Lemma 1. Let $N$ be consistent and complete, and $C$ be a clause. If
$[\![N]\!] \not\models C$ then there is a branch unifier $\sigma$ of $C$ against
$N$.

In order to take advantage of the previous lemma, consistency has to be achieved
(there is a respective inference rule in FDPLL). Fortunately, consistency is
also a syntactical property, and is decidable in the finite case as well. For
the purposes here, the following lemma is sufficient (it can be strengthened):

Lemma 2. Let $N$ be a non-contradictory literal set. If $N$ is inconsistent then
there is a pair of variable disjoint, non-comparable literals $K, L \in_\sim N$
with opposite sign (i.e. neither $K \gtrsim \overline{L}$ nor
$\overline{L} \gtrsim K$) such that (i) $K$ and $\overline{L}$ are unifiable,
i.e. $\sigma = \mathit{unify}(\{K, \overline{L}\})$ exists, and (ii) neither
$K\sigma \in_\sim N$ nor $L\sigma \in_\sim N$.

4 The FDPLL Calculus 

In this section the inference rules based on branch unifiers are introduced.
Recall from the previous section that branch unifiers are either "falsifying" or
"closing". However, to close literal sets earlier, hence find shorter
refutations, the FDPLL calculus uses a different notion of "closure":

Definition 7 ($^a$-Closed, $^a$-Open). Let $a$ be any constant from the
signature under consideration (or a "new" constant if none is supplied). By
$N^a$ denote the literal set obtained from $N$ by replacing in every literal
every occurrence of every variable by $a$. More formally,^6 $L^a = L\gamma$,
where $\gamma = \{x/a \mid x \in \mathit{var}(L)\}$, and
$N^a = \{L^a \mid L \in N\}$.

The literal set $N$ is $^a$-closed by $C$ and $\delta$ iff
$\overline{L} \in N^a$, for every $L \in C\delta$. The derived forms, as well as
the term "$^a$-open", are defined analogously to "closed" (Def. 5).

So, determining whether $N$ is $^a$-closed by $C$ is a question of
simultaneously matching all literals of $C$ to complementary literals in $N^a$.

^6 The function $\mathit{var}$ returns the set of variables occurring in its
argument.

Note 2 ("$^a$-Closed" Closes More Branches). It is not too difficult to see that
whenever $N$ is closed by $C$ then $N$ is $^a$-closed by $C$ as well. The
converse, however, is not true: for instance, $\{\neg P(x,y)\}$ is $^a$-closed
by $P(x,y) \lor P(x,x)$, but not closed by $P(x,y) \lor P(x,x)$ (because
$\neg P(x,x)$ cannot be instantiated to a variant of $\neg P(x,y)$).

Thus, by contraposition, if $N$ is $^a$-open wrt. $C$, $N$ is open wrt. $C$ as
well.
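
The simultaneous matching behind $^a$-closedness can be sketched in Python as a
small backtracking search. The encoding (nested tuples for terms, strings for
variables, the functor `'not'` for negation, the constant `'a'`) and the names
`ground_with` and `a_closed_by` are assumptions of this sketch; it returns a
witnessing substitution $\delta$ or `None`.

```python
def is_var(t):
    return isinstance(t, str)

def match(general, specific, subst=None):
    """Extend subst so that general instantiated by it equals specific, or None."""
    subst = dict(subst or {})
    if is_var(general):
        if general in subst:
            return subst if subst[general] == specific else None
        subst[general] = specific
        return subst
    if is_var(specific) or general[0] != specific[0] or len(general) != len(specific):
        return None
    for g, s in zip(general[1:], specific[1:]):
        subst = match(g, s, subst)
        if subst is None:
            return None
    return subst

def complement(lit):
    return lit[1] if lit[0] == 'not' else ('not', lit)

def ground_with(t, a):
    """Replace every variable occurrence in t by the constant a."""
    if is_var(t):
        return (a,)
    return (t[0],) + tuple(ground_with(arg, a) for arg in t[1:])

def a_closed_by(n, clause, a='a'):
    """Return delta such that N is a-closed by the clause and delta, else None."""
    n_a = [ground_with(k, a) for k in n]

    def search(lits, delta):
        if not lits:
            return delta
        for k in n_a:
            extended = match(complement(lits[0]), k, delta)
            if extended is not None:
                result = search(lits[1:], extended)
                if result is not None:
                    return result
        return None

    return search(list(clause), {})

N = [('not', ('P', 'x', 'y'))]
C = [('P', 'x', 'y'), ('P', 'x', 'x')]
delta = a_closed_by(N, C)
```

Since $N^a$ is ground, plain matching suffices and no occurs check is needed,
which is what makes this test cheaper than the closedness test of Def. 5.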

In the sequel, $S$ always denotes a finite clause set.

Definition 8 (Branch, Branch Set, Selection Functions). A branch $p$ is a
possibly empty, finite set of literals. A branch set $\mathcal{P}$ consists of a
finite set of branches. The branch set $\mathcal{P}$ is closed (by $C$, by $S$)
iff every $p \in \mathcal{P}$ is closed (by $C$, by $S$). The term "open" means
"not closed". Define $\mathcal{P}$ as $^a$-closed ($^a$-open) in the expected
way by using the $^a$-versions instead.

Assume as given a branch selection function $\mathit{sel}$ which maps any
$^a$-open branch set $\mathcal{P}$ wrt. $S$ to one of its $^a$-open branches.
This branch is referred to as the selected branch in $\mathcal{P}$. On
$^a$-closed branch sets, $\mathit{sel}$ may be undefined.

Finally, assume as given a literal selection function $\mathit{litsel}(C, p)$
that maps a clause $C$ and a branch $p$ that is $^a$-open wrt. $C$ to some
literal $L \in C$ such that neither $L \in_\sim p$ nor
$\overline{L} \in_\sim p$, provided such a literal exists, and may be undefined
otherwise.

Notice that the empty branch set is closed (and hence $^a$-closed) wrt. every
$S$, and that the branch set $\{\{\}\}$ is $^a$-open (and hence open) wrt. $S$,
unless $S$ contains the empty clause. The same holds for $\{\{\neg x\}\}$, as no
clause contains a "literal" $x$.

The purpose of the two selection functions will become clear soon. Next, the two
inference rules of FDPLL are defined.

Definition 9 (Split Inference Rule). The following inference rule Split
transforms a branch $p$, a clause $C$ such that $p$ is $^a$-open wrt. $C$, and a
substitution $\sigma$ into two new branches:

$$\mathrm{Split}(C, \sigma) \quad \frac{p}{p \cup \{L\} \qquad p \cup \{\overline{L}\}} \quad \text{if (i), (ii) and (iii) hold,}$$

where

(i) $\sigma$ is a branch unifier of $C$ against $p$, and
(ii) for some $L \in C\sigma$, neither $L \in_\sim p$ nor
$\overline{L} \in_\sim p$, and
(iii) $L = \mathit{litsel}(C\sigma, p)$.

If for given $p$, $C$ and $\sigma$ the conditions (i) and (ii) hold, it is said
that the Split inference rule is applicable to $p$, $C$ and $\sigma$, and the
result as the set $\{p \cup \{L\}, p \cup \{\overline{L}\}\}$ is denoted by
$\mathrm{Split}(p, C, \sigma)$. The literal $L$ in (iii) is called the literal
split on.


Note 3 (Purpose of Split). Assume that whenever Split is applied to a branch
$p$, then $p$ is consistent (that this can be achieved is argued for below). By
Proposition 1 then, $[\![p]\!]$ is an interpretation. Now, the intuition behind
Split is to find a clause $C$ and a branch unifier $\sigma$ of $C$ against $p$,
such that at least one ground instance of $C\sigma$ is false in $[\![p]\!]$. If
no such $\sigma$ exists, by Lemma 1 we can be sure that with $[\![p]\!]$ a model
for the clause set has been found. Otherwise, Split is applicable to $p$, $C$
and some branch unifier $\sigma$ by the following line of reasoning: since $p$
is $^a$-open, hence open (cf. Note 2), $\sigma$ must be a falsifying branch
unifier. But then, condition (ii) must be satisfied. For, if (ii) were not
satisfied, for every literal $L \in C\sigma$ it would hold (a) $L \in_\sim p$ or
(b) $\overline{L} \in_\sim p$. It is impossible that $L \in_\sim p$ for any
$L \in C\sigma$ because then $p$ would produce both $L$ (because
$L \in_\sim p$) and $\overline{L}$ (because $\sigma$ is a branch unifier of $C$
against $p$, and so $p$ produces $\overline{L}$), and thus $p$ would be
inconsistent. Hence case (b) applies for every $L \in C\sigma$, and so $\sigma$
would be a closing branch unifier of $C$ against $p$, but we know that $\sigma$
must be a falsifying branch unifier. Hence, with this contradiction condition
(ii) holds. As a consequence, the literal selection function $\mathit{litsel}$
is defined on $C\sigma$ and returns some arbitrarily (i.e. don't-care
nondeterministically) selected literal from $C\sigma$ which is used for
splitting $p$ into the two new cases as stated.

Since the use of branch unifiers is insisted upon, Split is applicable in a very
restricted way only. For instance, referring back to Example 5, Split is not
applicable to the branch $N_4$ and the clause $P(a,b,c)$. Non-applicability of
Split realizes a search-space reduction.

As said at the beginning of Note 3, it has to be made sure that $p$ is
consistent before Split is applied. This is the purpose of the following Commit
inference rule (as the branch $N_3$ in Example 4 shows, consistency does not
hold automatically).

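Definition 10 (Commit). The following inference rule Commit transforms a branch
$p$, a literal $L$ from $p$ and a substitution $\sigma$ into two new branches:

$$\mathrm{Commit}(L, \sigma) \quad \frac{p}{p \cup \{L\sigma\} \qquad p \cup \{\overline{L\sigma}\}} \quad \text{if (i), (ii) and (iii) hold,}$$

where

(i) $L \in p$, and
(ii) $\sigma = \mathit{unify}(\{L, \overline{K}\})$, for some $K \in_\sim p$,
variable disjoint from $L$, and
(iii) neither $L\sigma \in_\sim p$ nor $\overline{L\sigma} \in_\sim p$.

If for given $p$, $L$ and $\sigma$ the conditions (i) - (iii) hold, it is said
that the Commit inference rule is applicable to $p$, $L$ and $\sigma$, and the
result as the set $\{p \cup \{L\sigma\}, p \cup \{\overline{L\sigma}\}\}$ is
denoted by $\mathrm{Commit}(p, L, \sigma)$. The literal $L\sigma$ is called the
literal split on.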

Note 4 (Purpose of Commit). Lemma 2 states (almost directly) that Commit is
applicable to $p$, for some $L$ and $\sigma$, whenever $p$ is inconsistent.
Thus, by the contrapositive direction, by repeated application of Commit one
arrives at a consistent branch eventually.

The converse of Lemma 2 is not true: $\{P(x,a,u), \neg P(b,y,a), \neg P(b,a,u)\}$
is consistent but Commit is applicable (consider the first two literals). This
shows that Commit is possibly applied more often than necessary. As an
improvement, condition (iii) in Commit can be replaced by "$L$ produces
$L\sigma$ wrt. $p$ and $K$ produces $\overline{L\sigma}$ wrt. $p$."
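
The unimproved conditions (i) - (iii) of Commit can be turned into a candidate
search over a finite branch. The Python sketch below is an illustration only:
literals are nested tuples with the functor `'not'` for negation, variables are
strings, and fresh variants are obtained by appending a prime to every variable
name, which assumes that no variable in the branch already carries a prime. All
function names are hypothetical.

```python
def is_var(t):
    return isinstance(t, str)

def apply(t, s):
    """Apply a triangular substitution s to term t."""
    if is_var(t):
        return apply(s[t], s) if t in s else t
    return (t[0],) + tuple(apply(arg, s) for arg in t[1:])

def occurs(v, t):
    return v == t if is_var(t) else any(occurs(v, arg) for arg in t[1:])

def unify(t1, t2, s=None):
    """Return an MGU of t1 and t2 extending s, or None."""
    s = {} if s is None else s
    t1, t2 = apply(t1, s), apply(t2, s)
    if is_var(t1):
        if t1 == t2:
            return s
        return None if occurs(t1, t2) else {**s, t1: t2}
    if is_var(t2):
        return unify(t2, t1, s)
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None
    for a1, a2 in zip(t1[1:], t2[1:]):
        s = unify(a1, a2, s)
        if s is None:
            return None
    return s

def match(general, specific, subst=None):
    subst = dict(subst or {})
    if is_var(general):
        if general in subst:
            return subst if subst[general] == specific else None
        subst[general] = specific
        return subst
    if is_var(specific) or general[0] != specific[0] or len(general) != len(specific):
        return None
    for g, s in zip(general[1:], specific[1:]):
        subst = match(g, s, subst)
        if subst is None:
            return None
    return subst

def variants(k, l):
    return match(k, l) is not None and match(l, k) is not None

def in_variants(l, n):
    return any(variants(l, k) for k in n)

def complement(lit):
    return lit[1] if lit[0] == 'not' else ('not', lit)

def fresh(t):
    """Fresh variant; assumes primed variable names are unused so far."""
    return t + "'" if is_var(t) else (t[0],) + tuple(fresh(arg) for arg in t[1:])

def commit_candidates(p):
    """Literals L*sigma on which Commit could split the branch p."""
    found = []
    for lit in p:
        for k in p:
            sigma = unify(lit, fresh(complement(k)))
            if sigma is None:
                continue
            inst = apply(lit, sigma)
            if in_variants(inst, p) or in_variants(complement(inst), p):
                continue  # condition (iii) is violated
            if not in_variants(inst, found):
                found.append(inst)
    return found

a, b = ('a',), ('b',)
branch = [('P', 'x', a, 'u'), ('not', ('P', b, 'y', a)), ('not', ('P', b, a, 'u'))]
candidates = commit_candidates(branch)
```

On the consistent branch from the counterexample above, the search still
reports the instance $P(b,a,a)$ obtained from the first two literals, which
illustrates why the stronger, productivity-based condition is an improvement.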

Definition 11 (Derivation). A derivation $\mathcal{D}$ (from a clause set $S$)
is a (possibly infinite) sequence of branch sets
$\mathcal{D} = (\mathcal{P}_0 = \{\{\neg x\}\}), \mathcal{P}_1, \ldots, \mathcal{P}_n, \ldots$,
such that for $i \geq 0$,

(i) $\mathcal{P}_{i+1} = (\mathcal{P}_i \setminus \{p_i\}) \cup \mathrm{Split}(p_i, C, \sigma)$
for some clause $C \in S$ and substitution $\sigma$, or

(ii) $\mathcal{P}_{i+1} = (\mathcal{P}_i \setminus \{p_i\}) \cup \mathrm{Commit}(p_i, L, \sigma)$
for some literal $L$ and substitution $\sigma$,

where in both cases $\mathcal{P}_i$ is $^a$-open wrt. $S$ and
$p_i = \mathit{sel}(\mathcal{P}_i)$ is the selected (hence $^a$-open) branch in
$\mathcal{P}_i$. A derivation is called a refutation (of $S$) iff some
$\mathcal{P}_i$ is $^a$-closed (by $S$). A derivation of $\mathcal{P}_n$ is a
finite derivation that ends in $\mathcal{P}_n$.

Both Split and Commit are applied to $^a$-open branches only. Thus, if some
$\mathcal{P}_i$ is $^a$-closed, then $\mathcal{P}_i$ contains no single
$^a$-open branch and the derivation stops as a refutation.

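Example 6 (Derivation). The figure below shows in tree notation a sample
derivation from the clause set $S$ consisting of the clauses $C_1 = P(a,y)$ and
$C_2 = P(x,b) \lor \neg P(z,y) \lor Q(x,y,z)$.

[Figure: the branch sets $\mathcal{P}_1$, $\mathcal{P}_2$, $\mathcal{P}_3$ and
$\mathcal{P}_4$ in tree notation. Every tree has the root $\neg x$, which splits
into $P(a,y)$ (branch $p_1$) and $\neg P(a,y)$ (branch $p_2$, marked $\star$).
From $\mathcal{P}_2$ on, $p_1$ splits into $P(x,b)$ ($p_3$) and $\neg P(x,b)$
($p_4$); from $\mathcal{P}_3$ on, $p_4$ splits into $P(a,b)$ ($p_5$) and
$\neg P(a,b)$ ($p_6$); in $\mathcal{P}_4$, $p_5$ splits into $Q(x,y,a)$ ($p_7$)
and $\neg Q(x,y,a)$ ($p_8$, marked $\star$).]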

The branch set $\mathcal{P}_0 = \{\{\neg x\}\}$ is not depicted; $^a$-closed
branches are marked with a $\star$. $\mathcal{P}_1$ is obtained from
$\mathcal{P}_0$ by applying Split to $\{\{\neg x\}\}$ and $C_1$ (and the empty
substitution); the branch $p_2$ is $^a$-closed (even closed) due to $C_1$;
$\mathcal{P}_2$ is obtained from $\mathcal{P}_1$ by applying Split to $p_1$ and
$C_2$ (and some $\sigma$ not made explicit). Neither Split nor Commit is
applicable to $p_3$, so $[\![p_3]\!]$ is an interpretation (cf. Note 4) and
$[\![p_3]\!] \models S$ (cf. Note 3). To continue the example suppose that the
branch $p_4$ is selected in $\mathcal{P}_2$. The Commit rule is applicable to
$p_4$, which derives the branches $p_5$ and $p_6$. Applying Split to $p_5$ and
$C_2$ yields $p_7$ and $p_8$ (as branch literals in the computation of the
branch unifier use variants of $P(a,y)$ and $\neg P(x,b)$). The branch $p_8$ is
$^a$-closed by $C_2$, and to $p_7$, neither Commit nor Split is applicable (in
particular, the instance $P(x,b) \lor \neg P(a,b) \lor Q(x,b,a)$ of $C_2$ which
can be obtained by simultaneously unifying away the $P$-literals of $C_2$
against $P(a,b)$ and $\neg P(x,b)$ is produced by $Q(x,y,a)$). Thus,
$[\![p_7]\!] \models S$. The derivation can continue with $\mathcal{P}_4$ at the
branch called $p_6$ in $\mathcal{P}_3$, which is not shown here.

Note 5 (Regularity). For both inference rules, when applied to a branch $p$, the
literal $L$ split on is "new" to $p$ in the sense that neither $L$ nor
$\overline{L}$ is contained in $p$, not even as a variant. In Split, a
respective condition in $\mathit{litsel}$, and in Commit condition (iii) is
responsible for this. In other words, a stronger form of "regularity" than the
identity-based one used in rigid variable calculi holds; also, it is impossible
to derive a contradictory branch (cf. Def. 4). Thus, from the completeness of
FDPLL (Theorem 2) it follows immediately that FDPLL is a decision procedure for
Bernays-Schönfinkel logic (no function symbols except constants), a class that
cannot be decided easily by resolution or tableau methods.

This section is concluded with optimizations concerning Split, in particular the
literal selection function $\mathit{litsel}$ (Def. 8). The basis is the
following lemma:

Lemma 3 (Open Path Literal Selection). Let $p$ be an $^a$-open branch wrt. $S$,
$C \in S$ be a clause and $\sigma$ be a branch unifier of $C$ against $p$. Then,
all of the following hold:

(i) Split is applicable to $p$, $C$ and $\sigma$.

(ii) The set $\mathcal{L} = \{L \in C\sigma \mid \overline{L^a} \notin p^a\}$ is
non-empty.

(iii) $\overline{L} \notin_\sim p$, for every $L \in \mathcal{L}$.

(iv) If $p$ is consistent, then $L \notin_\sim p$, for every $L \in C\sigma$.

It is clear from the definition of "derivation" that Split is applied to a
branch $p$ only if $p$ is $^a$-open. So, this precondition of the lemma is
satisfied whenever Split is attempted. Now assume that $C$ and $\sigma$ exist as
required in the lemma statement, and thus that items (i) - (iv) hold.

Item (i) summarizes what was sketched at the beginning of Note 3.

Suppose that Commit applications are preferred to Split applications (as is
realized in the proof procedure in Section 7 below). Then, the branch $p$ is
consistent (cf. Note 4), and item (iv) shows that the condition (ii) in the
Split inference rule may be equivalently replaced by "for some $L \in C\sigma$,
$\overline{L} \notin_\sim p$" (since by item (iv) $L \notin_\sim p$ holds for
all literals $L \in C\sigma$). Item (iv) is proven as follows: we are given that
$\sigma$ is a branch unifier of $C$ against $p$. This means in particular that
each literal $\overline{L}$, where $L \in C\sigma$, is produced by some literal
$K \in_\sim p$. Now, if $L \in_\sim p$ held, then $p$ would produce $L$ as
well. By consistency, however, $p$ cannot produce $L$.

Next, we turn to sensible literal selection by $\mathit{litsel}$. Concretely,
$\mathit{litsel}(C\sigma, p)$ should return an element $L$ from the (non-empty)
set $\mathcal{L}$ defined in item (ii). Observe that splitting on this literal
$L$ is indeed possible, because by item (iii) the modified applicability
condition of Split explained in the previous paragraph is satisfied for $L$.
Now, item (ii) expresses that one need not select a literal from $C\sigma$ that
is solved in the sense that it contributes to $^a$-close $p$. For instance, if
$p = \{\neg x, P(x,y)\}$, $C = \neg P(x,x) \lor \neg P(x,b)$ and
$\sigma = \epsilon$, then $\mathcal{L} = \{\neg P(x,b)\}$. It is more sensible
to select $\neg P(x,b)$ for splitting, because the thus upcoming branch
$\{\neg x, P(x,y), P(x,b)\}$ is $^a$-closed, whereas splitting with
$\neg P(x,x)$ leaves the upcoming branch $\{\neg x, P(x,y), P(x,x)\}$
$^a$-open; with respect to closing branches, $P(x,y)$ and $P(x,x)$ are the same.
Selecting literals from $\mathcal{L}$ yields shorter refutations.

5 Universal Literals 

Under certain circumstances a literal $L$ occurring in a branch $p$ can be
treated, roughly, like a unit clause in resolution, i.e. $L$ then stands for all
its ground instances, without exception. In the terminology used here, such a
literal $L$ that is universal in a branch $p$ then produces all instances of
$L$, and any extension of $p$ also containing a literal $K$ with
$\overline{L} \gtrsim K$ can be considered as closed, without any explicit
refutation. Thus, many inferences can be saved by closing branches earlier.

The difficulty is to determine criteria under which it is sound to consider $L$
as universal in $p$; since branches can only close earlier, but never later,
completeness is preserved trivially, and only soundness is an issue. Possible
criteria to derive universal literals correspond to "unit-resulting resolution"
and splitting based on variable disjoint subclauses. This is sketched next.^7

The starting point is a stronger condition (than $^a$-closed) to close branches:
assume that a set of universal literals in $p$, $\mathit{Univ}(p) \subseteq p$,
has already been determined, and let $C$ be a clause. Now, $p$ is said to be
closed* by $C$ iff $C$ can be partitioned as $C = C_1 \cup C_2$ such that (i)
there is a simultaneous most general unifier $\sigma$ of every $L \in C_1$ with
some literal $\overline{K}$ s.t. $K \in_\sim \mathit{Univ}(p)$ (use a new
variant), and (ii) there is a substitution $\delta$ such that
$\overline{L\delta} \in (p \setminus \mathit{Univ}(p))^a$ for every
$L \in C_2\sigma$ (in particular, if $C_2\sigma = \{\}$ then
$\delta = \epsilon$ exists trivially). In other words, condition (ii) is the
same as saying that $p \setminus \mathit{Univ}(p)$ has to be $^a$-closed by
$C_2\sigma$.

For example, the branch $p_4$ in Example 6 is closed* (but not $^a$-closed) by
$C = \neg P(y,c) \lor P(x,b)$, using $C_1 = \neg P(y,c)$ and $C_2 = P(x,b)$.

Now, if (i) holds but (ii) does not hold for the considered clause $C$ and
branch $p$, then, as said, $p \setminus \mathit{Univ}(p)$ is not $^a$-closed by
$C_2\sigma$, and there is no closing branch unifier of $C_2\sigma$ against
$p \setminus \mathit{Univ}(p)$. But it is still possible that there is a
falsifying branch unifier $\sigma'$ of $C_2\sigma$ against
$p \setminus \mathit{Univ}(p)$. In order to describe how universal literals are
derived, assume that this is the case.

It follows that there is at least one literal $L \in C_2\sigma\sigma'$ such that
neither $L \in_\sim p$ nor $\overline{L} \in_\sim p$ (for this, it has to be
assumed that $C_1$ was chosen maximal, which is a safe assumption). In other
words, Split is applicable to $p$, $C_2$ (and also to $C$) and $\sigma\sigma'$
with literal $L$ split on. Now, if $L$ is variable disjoint from the rest, i.e.
if $\mathit{var}(L) \cap \mathit{var}(C_2\sigma\sigma' \setminus \{L\}) = \{\}$
holds, then $L$ is determined to be universal in $p \cup \{L\}$, otherwise it is
not universal in $p \cup \{L\}$. In the other branch $p \cup \{\overline{L}\}$
coming up in the Split application, the universal literals are just those of
$p$.

The improvements possible by universal literals can now be summarized as
follows: expressed abstractly, since fewer "open*" than $^a$-open branches
exist, their stronger properties can be taken advantage of. For instance, Commit
is never applicable when both $K$ and $L$ are universal in $p$ (cf. Def. 10);
also, if Split is applicable to $p$, $C$ and $\sigma\sigma'$, always a literal
$L \in C\sigma\sigma'$ exists such that neither $L$ nor $\overline{L}$ is
subsumed by a universal literal in $p$ (formally, Lemma 3 can be strictly
strengthened). In other words, instances of universal literals (or their
complements) need never be added by a Split. This mirrors "subsumption by a unit
clause" in resolution.

Notice as special cases that a ground literal $L$ is trivially universal in
every branch containing it, and, more importantly, if
$C_2\sigma\sigma' = \{L, \ldots, L\}$ is a singleton when read as a set, then
the "other" branch $p \cup \{\overline{L}\}$ is closed* (such as the branch
$p_2$ in Example 6). Hence, no branching is introduced then. It is not too
difficult to see that as a consequence, when applied to a Horn clause set, FDPLL
specializes to positive hyper-resolution.

It is worth noting that the described technique of reasoning with universal
literals can be built in by only minor modifications of the calculus, by
attaching boolean labels to the literals in branches to indicate their
"universal" status.

^7 The long version of the paper contains a full account.

6 Soundness and Completeness 

Theorem 1 (Soundness of FDPLL). Let $S$ be a clause set and $\mathcal{D}$ be a
refutation of $S$. Then $S$ is unsatisfiable.

Proof sketch: consider the last element $\mathcal{P}$ in $\mathcal{D}$, every
branch of which is $^a$-closed. Its ground instantiation
$\mathcal{P}^a = \{p^a \mid p \in \mathcal{P}\}$ can be seen as a usual semantic
tree $T$ made up of splits with complementary ground literals (cf. [CL73]).
Furthermore, the (finite) set of all those (ground) clauses $C\delta$ that are
used for closing the branches in $\mathcal{P}^a$ shows that every leaf in $T$ is
a failure node. Now apply the soundness result for usual semantic trees.

Next we turn to completeness. The FDPLL calculus proceeds by further modifying
one single branch set, and never "backtracks" to a previously derived branch
set. This notion of derivation indicates that to obtain a completeness result,
fairness is to be defined as an exhaustive process (up to redundancy) of
inference rule applications.

Before going into the details, a general note on this topic can be made: the
notion of "derivation" in Def. 11 can be adapted to virtually every confluent
rigid variable method. For these methods, the real challenge is to define
fairness as just mentioned in such a way that it can be turned into an effective
proof procedure. Only few attempts in this direction have been made
[BEF99,Bec98]. Coming from the FDPLL side, it seems possible to bring the
technique here to e.g. clausal tableaux calculi (by branching on clauses instead
of complementary literals).

Definition 12 (Path). Let $\mathcal{D}$ be a derivation that is not a
refutation, written as in Definition 11. Let
$\mathcal{P}^\infty = \bigcup_{i \geq 0} \mathcal{P}_i$ be the set of all
branches ever constructed in $\mathcal{D}$. A path (of $\mathcal{D}$) is a
possibly infinite sequence
$l = (p_0 = \{\neg x\}) \subset p_1 \subset \cdots \subset p_m \subset \cdots$
of branches in $\mathcal{P}^\infty$ such that for every $i \geq 0$

(i) $p_i = \mathit{sel}(\mathcal{P}_{s_i})$ is the selected ($^a$-open) branch
in some branch set $\mathcal{P}_{s_i}$ in $\mathcal{D}$, and

(ii) $p_{i+1} = p_i \cup \{L_i\}$, for some literal $L_i$, and

(iii) if in $\mathcal{D}$ the Commit or the Split inference rule is applied to
$p_i$, then $l$ contains a successor element $p_{i+1}$.

Finally, define the chain limit $\bigcup l = \bigcup_{i \geq 0} p_i$.

The limit $\bigcup l$ thus is an "infinitely long" branch, obtained by tracing
some branch that is extended infinitely often and remains $^a$-open. As an
important property, $\bigcup l$ is $^a$-open, because if $\bigcup l$ were
$^a$-closed, there would be a clause $C$ $^a$-closing $\bigcup l$. Since clauses
are finite and $\bigcup l$ is the limit of a chain, some finite
$p_j \subseteq \bigcup l$ would be $^a$-closed by $C$, contradicting item (i) in
the definition. It is only noted here without proof that for every derivation
$\mathcal{D}$ that is not a refutation a path of $\mathcal{D}$ exists.

Definition 13 (Finishedness, Fairness). Let $\mathcal{D}$ be a derivation that
is not a refutation and $l$ be a path of $\mathcal{D}$, written as in Definition
12. The path $l$ is finished iff for every $i \geq 0$ the following holds:

(i) if the Split inference rule is applicable to $p_i$ and some clause
$C \in S$ and substitution $\sigma$, then $C\sigma$ is produced by $p_j$
(Def. 2) for some $j \geq i$.

(ii) if the Commit inference rule is applicable to $p_i$ and some literal
$L \in p_i$ and substitution $\sigma$, then $p_j$ is consistent wrt. $L\sigma$,
for some $j \geq i$.

$\mathcal{D}$ is fair iff (i) $\mathcal{D}$ is a refutation or (ii) some path of
$\mathcal{D}$ is finished.

The purpose of condition (ii) in Definition 13 is to achieve that $\bigcup l$ is
consistent wrt. any literal $L\sigma$ identified by a possible Commit
application at time point $i$; similarly, the purpose of condition (i) in
Definition 13 is to achieve that $\bigcup l$ produces the clause instance
$C\sigma$ identified via a possible Split application at time point $i$.

That these purposes can be satisfied is a consequence of having $\bigcup l$ as a
chain limit (roughly, compactness wrt. the required properties holds then) and
that Split and Commit applications achieve (i) and (ii), respectively, whenever
violated (cf. Notes 4 and 3 again). However, actually carrying out the
inferences is only the last resort: it suffices that their effect is achieved,
namely (in case of Split, e.g.) that the clause $C\sigma$ is produced
eventually. Indeed, a Split application might cause a former possible Split
application to be impossible. For instance, when a clause
$C_3 = \neg P(a,b) \lor R(a)$ is added to the clause set in Example 6, then
Split is applicable to $p_1$ and $C_3$ (with $\sigma = \{x/R(a), y/b\}$), but
Split is no longer applicable to $C_3$ and $p_6$ (due to
$\neg P(a,b) \in p_6$).

These considerations shall serve as a proof sketch for the first main result:

Theorem 2 (Completeness of FDPLL). Let $S$ be a clause set and $\mathcal{D}$ be
a fair derivation from $S$. If $\mathcal{D}$ is not a refutation then $S$ is
satisfiable. More specifically, for every finished path $l$ of $\mathcal{D}$,
$[\![\bigcup l]\!]$ is an interpretation and $[\![\bigcup l]\!] \models S$.

Notice that in the contrapositive direction, the theorem is just a refutational
completeness result.

A final remark: the calculus is asymmetric wrt. the role of branches. Consider
e.g. a branch $p = \{\neg x, P(x), \neg P(a)\}$. To determine closure, $p^a$ is
used, and to extract a model $[\![p]\!]$ is used; inconsistency of $p^a$ (as in
the example) is not relevant for model extraction: $P(a)$ is simply false in
$[\![p]\!]$.^8 For Theorem 2 to hold there is no need to go beyond the Herbrand
interpretations as stated in Section 2.

^8 The calculus can be slightly improved by considering $p$ as closed if $p^a$
is contradictory (as in the example), although $p$ is $^a$-open.

7 Proof Procedure

So far, fair derivations are purely abstract mathematical objects, and it still
has to be demonstrated that an effective fair strategy exists. In essence, to
guarantee fairness, a maximal term depth bound is used, which all literals to be
split on in Split applications have to obey; starting with a small natural
number, the value of this bound is increased only after having exhausted Split
within the current value. This inner loop exhaustion always terminates,
essentially because variants of literals already present in a branch are never
added. Fortunately, Commit need not be subject to such a term depth bound check,
as it finitely exhausts on any (finite) literal set.

These are the essential ingredients of the following concrete proof procedure.

funct FDPLL($S$) =
  var $\mathcal{I}$;  % for the model representation

  funct Satisfiable($p$, bound) =
    % p: the current branch, bound: non-negative integer, the maximal term
    % depth admissible in literals for splitting
    if $^a$-Closed($p$, $S$)
    then return false
    else % Try a Commit. First, collect all candidate literals in L:
      var $\mathcal{L}$ := $\{L\sigma \mid L \in p, \exists K \in p : \sigma = \mathit{unify}(\{L, \mathit{new}(\overline{K})\}) \neq$ undefined $\land\ L\sigma \notin_\sim p \land \overline{L\sigma} \notin_\sim p\}$;
      if $\mathcal{L} \neq \{\}$
      then % Commit is applicable
        var $L_c$ :$\in$ $L \in \mathcal{L}$ : true;  % Select any candidate
        if Satisfiable($p \cup \{L_c\}$, bound)  % Left branch extension
        then return true
        else return Satisfiable($p \cup \{\overline{L_c}\}$, bound)  % Right branch
        fi
      else % Commit not applicable - try Split. Collect all candidates:
        var $\mathcal{L}$ := $\{L \mid \exists C \in S, \sigma \in \mathit{BranchUnify}(C, p) : L = \mathit{litsel}(C\sigma, p)\}$;
        if $\mathcal{L} = \{\}$
        then $\mathcal{I}$ := $p$;  % Split is not applicable - got a model in p.
             return true
        else % L is non-empty, so Split is applicable
          var $L_s$ :$\in$ $L' \in \mathcal{L}$ : $\|L'\| \leq$ bound;
          % Select any candidate within depth bound.
          % However, it might not exist:
          if $L_s \neq$ undefined
          then % Candidate within depth bound exists
            if Satisfiable($p \cup \{L_s\}$, bound)
            then return true
            else return Satisfiable($p \cup \{\overline{L_s}\}$, bound)
            fi
          else % Hit depth bound - try with higher one:
            return Satisfiable($p$, bound + 1)
    fi fi fi fi.

  % Body of FDPLL:
  if Satisfiable($\{\neg x\}$, 0)
  then return $\mathcal{I}$
  else return false
  fi.

Some functions remain unspecified: a call of ∀-Closed(p, S) is supposed to
return true iff p is ∀-closed by S (cf. Def. 5); new(L) is supposed to return a “fresh”
variant of L, containing no variables used so far. A call of BranchUnify(C, p) is
supposed to return a possibly empty, finite and complete set of branch unifiers of
C against p.^ Finally, ||L|| denotes the depth of |L| as a tree. All these functions
can be effectively implemented.

Some comments on the structure of the procedure: FDPLL is a wrapper
around Satisfiable. The p parameter of Satisfiable is the currently selected branch in
an implicitly constructed derivation. The branch selection realized in Satisfiable
is implicitly a left-to-right depth-first strategy.

The counterpart of ∀-closed branches in the definition of “derivation” is a
return value of false in Satisfiable; thus, closed branches are not kept in memory,
and so Satisfiable realizes a space efficient one-branch-at-a-time approach.

If every incarnation of Satisfiable returns false, then FDPLL returns false
as well, indicating unsatisfiability of S. FDPLL returns a model I only if some
incarnation of Satisfiable returns true on line 25, because a return value of true
in Satisfiable is immediately propagated to its caller. This happens only if neither
Commit nor Split is applicable to p. That [I] ⊨ S holds then follows directly
from Notes 3 and 4.

Now some more specific comments on Satisfiable: the set comprehension
formula on line 11 is just the applicability condition of Commit. An η-expression
η L ∈ 𝒞 : P(L) (as on lines 15 and 27) returns any L ∈ 𝒞 such that P(L) holds,
if such an L exists, and returns undefined otherwise. The set 𝒞 on line 21 is assigned
a finite set of literals such that whenever Split is applicable to some C ∈ S and σ
then litsel(Cσ, p) ∈ 𝒞 up to renaming. When reaching line 21, p is known to be ∀-open (because
line 9 was not reached) and consistent (Note 3 accounts for this fact). Hence, all
of Lemma 3 is applicable, and the improvements for litsel discussed there can
be taken advantage of.

The bound parameter of Satisfiable realizes a depth bound, which all literals 
to be split on in Split applications have to obey. The fairness of the procedure is 
guaranteed by only increasing bound (on line 37) after exhaustion of Split on the 
currently given value of bound. However, Commit need not be subject to such a 
depth bound check — it finitely exhausts on any finite literal set. 

In order to save space and concentrate on the most essential issues to be 
contributed here, the FDPLL procedure just described does not use universal 
literals (cf. Section 5). The long version of the paper contains a procedure with 
universal literals in full detail, which also realizes the restrictions of Commit and 
Split mentioned at the end of Section 5. A further built-in improvement is to 
remove a split rule application from a derivation if in the left derived branch
p ∪ {L} the literal L is not needed in the refutation of p ∪ {L}. Hence the
right branch p ∪ {L̄} need not be considered. This well-known improvement^
realizes (but is more powerful than) the purity rule of propositional DPLL.
The implementation mentioned in Section 1 refers to the version with these
improvements.

^ I.e. a complete set of branch unifiers contains, modulo renaming, every branch
unifier of C against p.

Also for space reasons, a correctness proof of FDPLL is not possible here. The 
comments above and in the program may serve as a rough sketch for the following 
main result. Notice in particular item (iv), stating explicitly the existence of a 
fair derivation, which was left open in Section 6. 

Theorem 3 (Correctness of the FDPLL Procedure). The FDPLL procedure
(with or without universal literals) has the following properties, for any
clause set S:

(i) Refutational soundness: If FDPLL(S) returns false then S is unsatisfiable.

(ii) Refutational completeness: If S is unsatisfiable then FDPLL(S) returns
false.

(iii) Model soundness: If FDPLL(S) returns a literal set I then [I] is an
interpretation and [I] ⊨ S.

(iv) Finishedness: If FDPLL(S) does not terminate, there is an infinite
sequence of incarnations Satisfiable(p_i, b_i) (for i ≥ 0) such that the sequence
I = (p_0 = {¬x}) ⊆ p_1 ⊆ ··· ⊆ p_n ⊆ ··· is a finished path of some fair
derivation D of S which is not a refutation.

8 Conclusions 

A directly lifted, confluent and strongly complete version of the DPLL proce- 
dure has been presented. As the theoretical concepts (in particular the model 
representation) are new, emphasis was put on these rather than on experimental 
studies. 

Related Work. Already in [CL73] a lifted DPLL calculus can be found. It uses 
the device of “pseudosemantic trees”, which, like FDPLL, realizes splits at the 
non-ground level. Nevertheless, the pseudosemantic tree method is very
different: in sharp contrast to FDPLL, a variable is treated rigidly there, i.e. as a
placeholder for a (one) not-yet-known term. As a consequence, as in all rigid
variable methods, only a very weak regularity condition can be used (cf. Note 5 
below). Furthermore, only a weak completeness result is known, which translates 
into a heavily backtracking oriented proof procedure only. 

When FDPLL reports “satisfiable”, a model representation has been
computed without further processing. This holds neither for the mentioned
method in [CL73] nor for the resolution based methods for model computation
in [FL93,FL96]. Typically, the latter attempt to compute a model by enumerat- 
ing all true ground literals, thereby interleaving this enumeration with calls to 
the resolution procedure again in order to determine the “next” ground literal. 

^ Also known as “dependency directed backtracking”, “level cut”, “condensing” etc.
Probably the most advanced first-order tableau system tailored for model
computation is Ramcet [Pel99], which is a successor of the resolution calculus
and earlier tableaux calculi of Peltier and his co-workers (see e.g. [CP95]). As
a drawback, Ramcet needs additional inference rules for model computation.
In particular, the “model explosion” rule seems problematic, as it branches out 
wrt. the whole signature of the formula set under consideration. FDPLL does 
not need such a rule. Furthermore, unlike for FDPLL, strong completeness for 
Ramcet is still unsolved (while proof confluence is trivial), in the sense that 
no effective fair strategy is known except a trivial one, which needs exponential 
space — a widespread problem of tableaux calculi. 

In the literature several methods are described that are related to FDPLL 
in the sense that variables are treated in a similar way (cf. “description of
FDPLL” in Section 1). FDPLL was influenced by, and is intended as a successor of, the
hyper tableau calculus [Bau98] (which in turn is a successor of the calculus in
[BFN96], a calculus in the tradition of Satchmo [MB88]). Among other things,
FDPLL improves on this calculus by not needing to store instances of clauses 
as the derivation proceeds - only the “current interpretation” needs to be kept. 
Beyond this, FDPLL is conceptually different: like any tableaux calculus, hy- 
per tableau branches on (sub-)formulas, whereas FDPLL branches in a binary 
way on complementary literals, i.e. uses “cut” as the single inference rule. The 
latter is more general and “builds in” standard improvements like factorization 
automatically. 

What was said about hyper tableau applies equally to the disconnection 
method [Bil96]. Also, no model computation result was given for this calculus. 

Also related are Plaisted’s hyper-linking calculi: the semantic hyper-linking 
calculus (SHL) [LP92] proceeds by searching in a guided way for (not necessarily 
ground) instances of input clauses, which are tested for unsatisfiability by a
propositional DPLL procedure. Much of what was said about hyper tableau
above applies to this calculus as well. In particular, unlike SHL, FDPLL does not
interleave two processes “clause instance generation” and “propositional DPLL”.
The former process occurs in FDPLL only “locally” within the splitting rule,
and derived clause instances need not be kept. It seems worthwhile to investigate
combinations of SHL and FDPLL, e.g. by replacing propositional DPLL in SHL 
by FDPLL, or picking up the idea of guided instance generation in SHL to 
improve FDPLL. However, this is future work. 

Quite different is the ordered semantic hyper linking (OSHL) calculus [PZ97]. 
OSHL has many interesting features, for instance “semantical guidance”. In the
intersection with FDPLL, it can be described as a calculus that applies unit- 
resulting resolution as long as possible, and then splits with a ground literal in 
order to begin the next round. The main motivation for FDPLL however was to 
get rid of such ground splits. Therefore, it seems realistic to possibly improve on 
OSHL by bringing in the non-ground splitting technique of FDPLL. 

Future Work. As always, much remains to be done: a part of my FDPLL
research plan is an efficient implementation. So far, only a slow and prototypical
implementation in Prolog exists (available from my home page). Although it
lacks crucial features such as term indexing, the performance seems promising,
in particular for satisfiable or non-Horn problems without equality (there is no
built-in equality treatment yet). In the respective subdivisions SAT and NNE of
the CASC-16 system competition 1999, FDPLL scored rank 4 of 6 and rank 4 of
10, respectively. From the TPTP library [SSY94], FDPLL can also solve some
difficult unsatisfiable problems quite quickly (e.g. ANA002-4, the intermediate
value theorem, in 3 seconds). The overall success rate is about 40% (Otter: 52%)
for a time limit of 10 minutes.

Other sources for future work are combinations of the techniques described 
here with hyper-linking calculi, equality treatment and improved termination 
behavior to name a few. On the theoretical level, the relationship between the 
model representation capabilities in FDPLL and the atomic model
representations [GP98] used in the resolution and tableau world should be clarified.

Acknowledgments. I am grateful to the members of our group and to David 
Plaisted for discussions about FDPLL and comments on the paper. Three refer- 
ees gave valuable comments and suggestions, in particular concerning improve- 
ments of the Commit rule and universal literals. 
Abstract. This paper presents a sound and complete set of abstract
transformation rules for rigid E-unification. Abstract congruence closure,
syntactic unification and paramodulation are the three main components 
of the proposed method. The method obviates the need for using any 
complicated term orderings and easily incorporates suitable optimiza- 
tion rules. Characterization of substitutions as congruences allows for a 
comparatively simple proof of completeness using proof transformations. 
When specialized to syntactic unification, we obtain a set of abstract 
transition rules that describe a class of efficient syntactic unification al- 
gorithms. 



1 Introduction 

Rigid E-unification arises when tableaux-based theorem proving methods are
extended to logic with equality. The general, simultaneous rigid E-unification
problem is undecidable [7] and it is not known if a complete set of rigid
E-unifiers in the sense of [10] gives a complete proof procedure for first-order logic
with equality. Nevertheless complete tableau methods for first-order logic with
equality can be designed based on incomplete, but terminating, procedures for
rigid E-unification [8]. A simpler version of the problem is known to be decidable
and also NP-complete, and several corresponding algorithms have been proposed 
in the literature (not all of them correct) [9, 10, 5, 8, 11, 6]. In the current paper, 
we consider this standard, non-simultaneous version of the problem. 

Most of the known algorithms for finding a complete set of (standard) rigid 
unifiers employ techniques familiar from syntactic unification, completion and 
paramodulation. Practical algorithms also usually rely on congruence closure 
procedures in one form or another, though the connection between the various 
techniques has never been clarified. The different methods that figure promi- 
nently in known rigid unification procedures — unification, narrowing, superposi- 
tion, and congruence closure — have all been described in a framework based on 
transformation rules. We use the recent work on congruence closure as a
starting point [12, 4] and formulate a rigid E-unification method in terms of fairly
abstract transformation rules. 

* The research described in this paper was supported in part by the National Science 
Foundation under grant CCR-9902031. 
This approach has several advantages. For one thing, we provide a concise
and clear explication of the different components of rigid E-unification and the
connections between them. A key technical problem has been the integration of
congruence closure with unification techniques, the main difficulty being that
congruence closure algorithms manipulate term structures over an extended
signature, whereas unifiers need to be computed over the original signature. We
solved this problem by rephrasing unification problems in terms of congruences
and then applying proof theoretic methods that had originally been developed
in the context of completion and paramodulation. Some of the new and
improved features of the resulting rigid E-unification method in fact depend on
the appropriate use of extended signatures.

Almost all the known rigid E-unification algorithms require relatively
complicated term orderings. In particular, most approaches go to great length to
determine a suitable orientation of equations (between terms to be unified),
such as $x \approx fy$, a decision that depends of course on the terms that are
substituted (in a “rigid” way) for the variables $x$ and $y$. But since the identification
of a substitution is part of the whole unification problem, decisions about the
ordering have to be made during the unification process, either by orienting
equations non-deterministically, as in [10], or by treating equations as bi-directional
constrained rewrite rules (and using unsatisfiable constraints to eliminate wrong
orientations) [5]. In contrast, the only orderings we need are simple ones in
which the newly introduced constants are smaller than all other non-constant
terms. The advantage of such simple orderings is twofold, in that not only the
description of the rigid E-unification method itself, but also the corresponding
completeness proofs, become simpler.^ Certain optimizations can be easily
incorporated in our method that reduce some of the non-determinism still inher- 
ent in the unification procedure. The treatment of substitutions as congruences 
defined by special kinds of rewrite systems (rather than as functions or mor- 
phisms) is a novel feature that allows us to characterize various kinds of unifiers 
in proof-theoretic terms via congruences. 

As an interesting fallout of this work we obtain an abstract description of 
a class of efficient syntactic unification algorithms based on recursive descent. 
Other descriptions of these algorithms are typically based on data structures 
and manipulation of term dags. Since our approach is suitable for abstractly 
describing sharing, we obtain a pure rule based description. 

One motivation for the work presented here has been the generalization of 
rigid E-unification modulo theories like associativity and commutativity, which
we believe are of importance for theorem proving applications. Our approach, 

^ A key idea of congruence closure is to employ a concise and simplified term repre- 
sentation via variable abstraction, so that complicated term orderings are no longer 
necessary or even applicable. There usually is a trade-off between the simplicity of 
terms thus obtained and the loss of term structure [4]. In the case of rigid unification, 
we feel that simplicity outweighs the loss of some structure, as the non-determinism 
inherent in the procedure limits the effective exploitation of a more complicated term 
structure in any case. 
especially because of the use of extensions of signatures and substantially weaker
assumptions about term orderings, should more easily facilitate the development 
of such generalized unification procedures. 

We also believe that our way of describing rigid E-unification will facilitate a
simpler proof of the fact that the problem is in NP. Previous proofs of
membership of this problem in NP “require quite a bit of machinery” [10]. The weaker
ordering constraints, a better integration of congruence closure and a rule-based
description of the rigid E-unification procedure should result in a simpler proof.

2 Preliminaries 

Given a set $\Sigma = \bigcup_n \Sigma_n$ and a disjoint set $V$, we define $T(\Sigma, V)$ as the smallest
set containing $V$ and such that $f(t_1, \ldots, t_n) \in T(\Sigma, V)$ whenever $f \in \Sigma_n$ and
$t_1, \ldots, t_n \in T(\Sigma, V)$. The elements of the sets $\Sigma$, $V$ and $T(\Sigma, V)$ are respectively
called function symbols, variables and terms (over $\Sigma$ and $V$). The set $\Sigma$ is called
a signature and the index $n$ of the set $\Sigma_n$ to which a function symbol $f$ belongs
is called the arity of the symbol $f$. Elements of arity 0 are called constants. By
$T(\Sigma)$ we denote the set $T(\Sigma, \emptyset)$ of all variable-free, or ground terms. The symbols
$s, t, u, \ldots$ are used to denote terms; $f, g, \ldots$, function symbols; and $x, y, z, \ldots$,
variables.

A substitution is a mapping from variables to terms such that $x\sigma = x$ for
all but finitely many variables $x$. We use post-fix notation for application of
substitutions and use the letters $\sigma, \theta, \ldots$ to denote substitutions. A substitution $\sigma$
can be extended to the set $T(\Sigma, V)$ by defining $f(t_1, \ldots, t_n)\sigma = f(t_1\sigma, \ldots, t_n\sigma)$.
The domain $\mathcal{D}om(\sigma)$ of a substitution $\sigma$ is defined as the set $\{x \in V \mid x\sigma \neq x\}$;
and the range $\mathcal{R}an(\sigma)$ as the set of terms $\{x\sigma : x \in \mathcal{D}om(\sigma)\}$. A substitution $\sigma$
is idempotent if $\sigma\sigma = \sigma$.^

We usually represent a substitution $\sigma$ with domain $\{x_1, \ldots, x_n\}$ as a set of
variable “bindings” $\{x_1 \mapsto t_1, \ldots, x_n \mapsto t_n\}$, where $t_i = x_i\sigma$. By a triangular
form representation of a substitution $\sigma$ we mean a sequence of bindings,

$$[x_1 \mapsto t_1;\ x_2 \mapsto t_2;\ \ldots;\ x_n \mapsto t_n],$$

such that $\sigma$ is the composition $\sigma_1\sigma_2 \cdots \sigma_n$ of substitutions $\sigma_i = \{x_i \mapsto t_i\}$.



Congruences 

An equation is a pair of terms, written as $s \approx t$. The replacement relation $\to_{E^g}$
induced by a set of equations $E$ is defined by: $u[l] \to_{E^g} u[r]$ if, and only if, $l \approx r$
is in $E$. The rewrite relation $\to_E$ induced by a set of equations $E$ is defined by:
$u[l\sigma] \to_E u[r\sigma]$ if, and only if, $l \approx r$ is in $E$ and $\sigma$ is some substitution. In other
words, the rewrite relation induced by $E$ is the replacement relation induced by
$\bigcup_\sigma E\sigma$, where $E\sigma$ is the set $\{s\sigma \approx t\sigma : s \approx t \in E\}$.

^ We use juxtaposition $\sigma\tau$ to denote function composition, i.e., $x(\sigma\tau) = (x\sigma)\tau$.

If $\to$ is a binary relation, then $\leftarrow$ denotes its inverse, $\leftrightarrow$ its symmetric closure,
$\to^+$ its transitive closure and $\to^*$ its reflexive-transitive closure. Thus, $\leftrightarrow^*_{E^g}$
denotes the congruence relation^ induced by $E$. The equational theory of a set
$E$ of equations is defined as the relation $\leftrightarrow^*_E$. Equations are often called rewrite
rules, and a set $E$ a rewrite system, if one is interested particularly in the rewrite
relation $\to^*_E$ rather than the equational theory $\leftrightarrow^*_E$.



Substitutions as Congruences 

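It is often useful to reason about a substitution $\sigma$ by considering the congruence
relation $\leftrightarrow^*_{E_\sigma^g}$ induced by the set of equations $E_\sigma = \{x\sigma \approx x : x \in \mathcal{D}om(\sigma)\}$.

The following proposition establishes a connection between substitutions and
congruences.

Proposition 1. (a) For all terms $t \in T(\Sigma, V)$, $t \leftrightarrow^*_{E_\sigma^g} t\sigma$. Therefore,
$E\sigma \subseteq\ \leftrightarrow^*_{(E \cup E_\sigma)^g}$. (b) If the substitution $\sigma$ is idempotent, then for any two terms
$s, t \in T(\Sigma, V)$, we have $s\sigma = t\sigma$ if, and only if, $s \leftrightarrow^*_{E_\sigma^g} t$.

Proof. Part (a) is straight-forward and also implies the “only if” direction of
part (b). For the “if” direction, note that if $u \leftrightarrow_{E_\sigma^g} v$, then $u = u[l]$ and $v = u[r]$
for some equation $l \approx r$ or $r \approx l$ in $E_\sigma$. Thus,
$u\sigma = (u[l])\sigma = u\sigma[l\sigma] \leftrightarrow_{(E_\sigma\sigma)^g} u\sigma[r\sigma] = (u[r])\sigma = v\sigma$.
Therefore, if $s \leftrightarrow^*_{E_\sigma^g} t$, then $s\sigma \leftrightarrow^*_{(E_\sigma\sigma)^g} t\sigma$. But if $\sigma$ is
idempotent, then $E_\sigma\sigma$ consists only of trivial equations $t \approx t$, and hence $s\sigma$ and
$t\sigma$ are identical.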



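Theorem 1. Let $\sigma$ be an idempotent substitution and $[x_1 \mapsto t_1; \ldots; x_n \mapsto t_n]$ be
a triangular form representation of $\sigma$. Then the congruences $\leftrightarrow^*_{E_\sigma^g}$ and
$\leftrightarrow^*_{(\bigcup_i E_{\sigma_i})^g}$ are identical, where $\sigma_i = \{x_i \mapsto t_i\}$.

Proof. It is sufficient to prove that $E_\sigma \subseteq\ \leftrightarrow^*_{(\bigcup_i E_{\sigma_i})^g}$ and
$E_{\sigma_i} \subseteq\ \leftrightarrow^*_{E_\sigma^g}$ for all $1 \le i \le n$. If $x_i\sigma \approx x_i$ is an equation in $E_\sigma$, then

$$x_i\sigma = x_i\sigma_1 \cdots \sigma_n = x_i\sigma_i \cdots \sigma_n = t_i\sigma_{i+1} \cdots \sigma_n,$$

and therefore, using Proposition 1 part (a),

$$x_i \leftrightarrow^*_{E_{\sigma_i}^g} t_i \leftrightarrow^*_{E_{\sigma_{i+1}}^g} t_i\sigma_{i+1} \leftrightarrow^*_{E_{\sigma_{i+2}}^g} \cdots \leftrightarrow^*_{E_{\sigma_n}^g} t_i\sigma_{i+1} \cdots \sigma_n = x_i\sigma.$$

For the converse, using part of the above proof, we get
$t_i \leftrightarrow^*_{(\bigcup_{k>i} E_{\sigma_k})^g} x_i\sigma \leftrightarrow^*_{E_\sigma^g} x_i$.
By induction hypothesis we can assume $E_{\sigma_k} \subseteq\ \leftrightarrow^*_{E_\sigma^g}$ for $k > i$, and then
the above proof would establish $E_{\sigma_i} \subseteq\ \leftrightarrow^*_{E_\sigma^g}$.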

^ A congruence relation is a reflexive, symmetric and transitive relation on terms that
is also a replacement relation.
The theorem indicates that if an idempotent substitution $\sigma$ can be expressed
as a composition $\sigma_1\sigma_2 \cdots \sigma_n$ of finitely many idempotent substitutions $\sigma_i$ with
disjoint domains, then the congruence induced by $E_\sigma$ is identical to the
congruence induced by $\bigcup_i E_{\sigma_i}$. We denote by $E_{\sigma_1 \cdots \sigma_n}$ the set $\bigcup_i E_{\sigma_i}$.

The variable dependency ordering $\succ_V$ induced by a set $S$ of equations on the
set $V$ of variables is defined by: $x \succ_V y$ if there exists an equation $t[x] \approx y$ in $S$.
A finite set $S$ of equations is said to be substitution-feasible if (i) the right-hand
sides of equations in $S$ are all distinct variables and (ii) the variable dependency
ordering $\succ_V$ induced by $S$ is well-founded. If $S$ is a substitution-feasible set
$\{t_i \approx x_i : 1 \le i \le n\}$ such that $x_j \not\succ_V x_i$ whenever $i > j$, then the idempotent
substitution $\sigma$ represented by the triangular form $[x_1 \mapsto t_1; \ldots; x_n \mapsto t_n]$ is
called the substitution corresponding to $S$. Given an idempotent substitution $\sigma$
and any triangular form representation $\sigma_1\sigma_2 \cdots \sigma_n$, the sets $E_\sigma$ and $\bigcup_i E_{\sigma_i}$ are
substitution-feasible.
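
The two conditions of substitution-feasibility, and the ordering requirement on the triangular form, can be checked mechanically. The following Python fragment is a minimal sketch under assumptions of our own: equations $t \approx x$ are given as pairs (t, x), variables are strings, applications are tuples, and the names variables and triangular_form are hypothetical. It returns a triangular ordering of the bindings when the set is substitution-feasible and None otherwise.

```python
def variables(t):
    """Set of variables of a term (variables are str, applications tuples)."""
    if isinstance(t, str):
        return {t}
    out = set()
    for a in t[1:]:
        out |= variables(a)
    return out

def triangular_form(S):
    """S is a list of pairs (t, x) standing for equations t = x.
    Returns bindings [(x1, t1), ..., (xn, tn)] such that no earlier xj
    occurs in a later ti, or None if S is not substitution-feasible."""
    rhs = [x for _, x in S]
    # Condition (i): right-hand sides are pairwise distinct variables.
    if any(not isinstance(x, str) for x in rhs) or len(set(rhs)) != len(rhs):
        return None
    term_of = {x: t for t, x in S}
    order, state = [], {}  # state: 1 = being visited, 2 = finished

    def visit(y):
        if state.get(y) == 2:
            return True
        if state.get(y) == 1:
            return False  # cycle: the dependency ordering is not well-founded
        state[y] = 1
        for x in variables(term_of[y]):  # every such x satisfies x > y
            if x in term_of and not visit(x):
                return False
        state[y] = 2
        order.append(y)
        return True

    # Condition (ii): depth-first search detects cycles in the ordering.
    for y in rhs:
        if not visit(y):
            return None
    order.reverse()  # y is placed before every x that occurs in its term
    return [(y, term_of[y]) for y in order]

ok = triangular_form([(("c3",), "x"), ("x", "y")])
```

For a finite set, well-foundedness of the dependency ordering amounts to the absence of cycles, which is why a depth-first search suffices here.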



Rigid E-Unification

Definition 1. Let $E$ be a set of equations (over $\Sigma \cup V$) and $s$ and $t$ be terms in
$T(\Sigma, V)$. A substitution $\sigma$ is called a rigid E-unifier of $s$ and $t$ if $s\sigma \leftrightarrow^*_{(E\sigma)^g} t\sigma$.

When $E = \emptyset$, rigid E-unification reduces to syntactic unification.

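Theorem 2. An idempotent substitution $\sigma$ is a rigid E-unifier of $s$ and $t$ if and
only if $s \leftrightarrow^*_{(E \cup E_\sigma)^g} t$.

Proof. Let $\sigma$ be an idempotent substitution that is a rigid E-unifier of $s$ and $t$.
By definition we have $s\sigma \leftrightarrow^*_{(E\sigma)^g} t\sigma$. Using Proposition 1 part (a), we get

$$s \leftrightarrow^*_{E_\sigma^g} s\sigma \leftrightarrow^*_{(E \cup E_\sigma)^g} t\sigma \leftrightarrow^*_{E_\sigma^g} t.$$

Conversely, suppose $\sigma$ is an idempotent substitution such that $s \leftrightarrow^*_{(E \cup E_\sigma)^g} t$.
Then, $s\sigma \leftrightarrow^*_{(E\sigma \cup E_\sigma\sigma)^g} t\sigma$. But $(E_\sigma)\sigma$ consists of trivial equations of the form
$t \approx t$ and hence we have $s\sigma \leftrightarrow^*_{(E\sigma)^g} t\sigma$.

If the substitution $\sigma$ is not idempotent, the above proof does not go through
as the set $(E_\sigma)\sigma$ may contain non-trivial equations. However, we may use
Theorem 1 to establish that the congruences induced by $E \cup E_\sigma$ and $E \cup \bigcup_i E_{\theta_i}$, where
$\theta_1 \cdots \theta_n$ is a triangular representation for $\sigma$, are identical.

We obtain a characterization of standard E-unification if we replace the
congruence induced by $E \cup E_\sigma$ by the congruence induced by $\bigcup_\sigma E\sigma \cup E_\sigma$ in the
above theorem, and a characterization of syntactic unifiers if $E = \emptyset$.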



Orderings on Substitutions 

Unification procedures are designed to find most general unifiers of given terms.
A substitution $\sigma$ is said to be more general than another substitution $\theta$ with
respect to a set of variables $V$, denoted by $\sigma \preceq^V \theta$, if there exists a substitution
$\sigma'$ such that $x\sigma\sigma' = x\theta$ for all $x \in V$.

A substitution $\sigma$ is called more general modulo $E^g$ on $V$ than $\theta$, denoted
by $\sigma \preceq^V_{E^g} \theta$, if there exists a substitution $\sigma'$ such that $x\sigma\sigma' \leftrightarrow^*_{(E\theta)^g} x\theta$ for all
$x \in V$. We also define an auxiliary relation $\sqsubseteq$ between substitutions by $\sigma \sqsubseteq^V_{E^g} \theta$
if $x\sigma \leftrightarrow^*_{(E\theta)^g} x\theta$ for all $x \in V$. Two substitutions $\sigma$ and $\theta$ are said to be equivalent
modulo $E^g$ on $V$ if $\sigma \preceq^V_{E^g} \theta$ and $\theta \preceq^V_{E^g} \sigma$.

If $\sigma$ is a rigid E-unifier of $s$ and $t$, then there exists an idempotent rigid
E-unifier of $s$ and $t$ that is more general modulo $E^g$ than $\sigma$. Hence, in this
paper, we will be concerned only with idempotent unifiers. Comparisons between
idempotent substitutions can be characterized via congruences.

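Theorem 3. Let $\sigma$ and $\theta$ be idempotent substitutions and $V$ the set of variables
in the domain or range of $\sigma$. Then, $\sigma \preceq^V_{E^g} \theta$ if and only if $E_\sigma \subseteq\ \leftrightarrow^*_{(E \cup E_\theta)^g}$.

Proof. If $\sigma$ is idempotent then we can prove that $\sigma \preceq^V_{E^g} \theta$ if and only if $\sigma\theta \sqsubseteq^V_{E^g} \theta$.
Now assuming $\sigma \preceq^V_{E^g} \theta$, we have $x\sigma\theta \leftrightarrow^*_{(E\theta)^g} x\theta$ for all $x \in V$. But $x\theta \leftrightarrow^*_{E_\theta^g} x$
and $E\theta \subseteq\ \leftrightarrow^*_{(E \cup E_\theta)^g}$ by Proposition 1. Therefore, it follows that $E_{\sigma\theta} \subseteq\ \leftrightarrow^*_{(E \cup E_\theta)^g}$.
But again by Proposition 1, $x\sigma\theta \leftrightarrow^*_{E_\theta^g} x\sigma$ and therefore, $E_\sigma \subseteq\ \leftrightarrow^*_{(E \cup E_\theta)^g}$.

Conversely, if $E_\sigma \subseteq\ \leftrightarrow^*_{(E \cup E_\theta)^g}$ then $E_\sigma\theta \subseteq\ \leftrightarrow^*_{(E\theta)^g \cup (E_\theta\theta)^g}$. Since the
equations in $E_\theta\theta$ are all trivial equations of the form $u \approx u$, it follows that
$E_\sigma\theta \subseteq\ \leftrightarrow^*_{(E\theta)^g}$, which implies $\sigma\theta \sqsubseteq^V_{E^g} \theta$ and hence $\sigma \preceq^V_{E^g} \theta$.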

3 Rigid E-Unification

We next present a set of abstract transformation (or transition) rules that can
be used to describe a variety of rigid E-unification procedures. By Theorem 2,
the problem of finding a rigid E-unifier of two terms $s$ and $t$ amounts to finding
a substitution-feasible set $S$ such that $s \leftrightarrow^*_{(E \cup S)^g} t$, and involves (1)
constructing a substitution-feasible set $S$, and (2) verifying that $s$ and $t$ are congruent
modulo $E \cup S$. Part (1), as we shall see, can be achieved by using syntactic
unification, narrowing and superposition. Efficient techniques for congruence testing
via abstract congruence closure can be applied to part (2).

Abstract Congruence Closure 

A term $t$ is in normal form with respect to a rewrite system $R$ if there is no
term $t'$ such that $t \to_R t'$. A rewrite system $R$ is said to be (ground) confluent
if for all (ground) terms $t$, $u$ and $v$ with $u \leftarrow^*_R t \to^*_R v$ there exists a term $w$ such
that $u \to^*_R w \leftarrow^*_R v$. It is terminating if there exists no infinite reduction sequence
$t_0 \to_R t_1 \to_R t_2 \cdots$ of terms. Rewrite systems that are (ground) confluent and
terminating are called (ground) convergent.

Let $\Gamma$ be a set of function symbols and variables and $K$ be a disjoint set
of constants. An (abstract) congruence closure (with respect to $\Gamma$ and $K$) is a
ground convergent rewrite system $R$ over the signature $\Gamma \cup K$^ such that (i) each
rule in $R$ is either a D-rule of the form $f(c_1, \ldots, c_k) \approx c_0$ where $f$ is a $k$-ary
symbol in $\Gamma$ and $c_0, c_1, \ldots, c_k$ are constants in $K$, or a C-rule of the form $c_0 \approx c_1$
with $c_0, c_1 \in K$, and (ii) for each constant $c \in K$ that is in normal form with
respect to $R$, there exists a term $t \in T(\Gamma)$ such that $t \to^*_{R^g} c$. Furthermore, if $E$
is a set of equations (over $\Gamma \cup K$) and $R$ is such that (iii) for all terms $s$ and $t$
in $T(\Gamma)$, $s \leftrightarrow^*_{E^g} t$ if, and only if, $s \to^*_{R^g} \circ \leftarrow^*_{R^g} t$, then $R$ is called an (abstract)
congruence closure for $E$.

For example, let $E_0 = \{gfx \approx z, fgy \approx z\}$ and $\Gamma = \{g, f, x, y, z\}$. The
set $E_1$ consisting of the rules $x \approx c_1$, $y \approx c_2$, $z \approx c_3$, $fc_1 \approx c_4$, $gc_4 \approx c_3$,
$gc_2 \approx c_5$, $fc_5 \approx c_3$ is an abstract congruence closure (with respect to $\Gamma$ and
$\{c_1, \ldots, c_5\}$) for $E_0$.

The key idea underlying (abstract) congruence closure is that the constants in
$K$ serve as names for congruence classes, and equations $f(c_1, \ldots, c_k) \approx c_0$ define
relations between congruence classes: a term $f(t_1, \ldots, t_k)$ is in the congruence
class $c_0$ if each $t_i$ is in the congruence class $c_i$.
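
The role of the constants as names for congruence classes can be made concrete with a short Python sketch. It assumes that the example closure consists of the D-rules $x \approx c_1$, $y \approx c_2$, $z \approx c_3$, $fc_1 \approx c_4$, $gc_4 \approx c_3$, $gc_2 \approx c_5$, $fc_5 \approx c_3$, with variables treated as constants; the dictionary encoding and the names E1, normalize and congruent are hypothetical. C-rules do not occur in this example and are not handled.

```python
# D-rules f(c1,...,ck) = c0, stored as a map from the flat left-hand side
# (symbol, c1, ..., ck) to the class name c0. Variables are treated as
# 0-ary symbols, so x is encoded as ("x",).
E1 = {
    ("x",): "c1", ("y",): "c2", ("z",): "c3",
    ("f", "c1"): "c4", ("g", "c4"): "c3",
    ("g", "c2"): "c5", ("f", "c5"): "c3",
}

def normalize(t, rules):
    """Bottom-up normal form of a ground term given as nested tuples.
    Returns a class name (str) if the term reduces to a constant, and an
    irreducible flat tuple otherwise."""
    args = tuple(normalize(a, rules) for a in t[1:])
    flat = (t[0],) + args
    return rules.get(flat, flat)

def congruent(s, t, rules):
    """For a convergent system, two terms are congruent exactly when
    their normal forms coincide."""
    return normalize(s, rules) == normalize(t, rules)

gfx = ("g", ("f", ("x",)))
fgy = ("f", ("g", ("y",)))
z = ("z",)
```

Both sides of each equation in $E_0$ normalize to the same class name, which is how congruence modulo $E_0$ is decided without choosing any ordering on the original terms.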

The construction of a congruence closure will be an integral part of our rigid
E-unification method. We will not list specific transformation rules, but refer the
reader to the description in [2] which can be easily adapted to the presentation
in the current paper.

For our purposes, transition rules are defined on quintuples $(K, E; V, E?; S)$,
where $\Sigma$ is a given fixed signature, $V$ is a set of variables, $K$ is a set of constants
disjoint from $\Sigma \cup V$, and $E$, $E?$ and $S$ are sets of equations. The first two
components of the quintuple represent a partially constructed congruence closure,
whereas the third and fourth components are needed to formalize syntactic
unification, narrowing and superposition. The substitution-feasible set in the fifth
component stores an answer substitution in the form of a set of equations. For a
given state $(K, E; V, E?; S)$, we try to find a substitution $\sigma$ with $\mathcal{D}om(\sigma) \subseteq V$
that is a rigid E-unifier of each equation in the set $E?$. By an initial state we
mean a tuple $(\emptyset, E_0; V_0, \{s \approx t\}; \emptyset)$ where $V_0$ is the set of all variables that
occur in $E_0$, $s$ or $t$. Transition rules specify ways in which one quintuple state can
be transformed into another such state. The goal is to successively transform a
given initial state to a state in which the fourth component is empty.



C-Closure:

$$\frac{(K, E; V, E?; S)}{(K', E'; V, E?; S)}$$

if $K \subseteq K'$ and $E'$ is an abstract congruence closure (with respect to $\Sigma \cup V$ and
$K'$) for $E$.

Note that we need not choose any term ordering, which is one of the main 
differences of our approach with most other rigid unification methods. 



^ We treat variables as constants and in this sense speak of a ground convergent system 
R. 
Syntactic Unification

C-closure can potentially extend the signature by a set of constants. Thus we
obtain substitutions (or substitution-feasible sets) and terms over an extended
signature that need to be translated back to substitutions and terms in the
original signature, essentially by replacing these constants by terms from the
original signature. For example, consider the abstract congruence closure $E_1$
for $E_0 = \{gfx \approx z, fgy \approx z\}$ described above, and the substitution-feasible set
$S = \{c_3 \approx x, x \approx y\}$. This set can be transformed by replacing the constant $c_3$ by
$z$ to give a substitution-feasible set $\{z \approx x, x \approx y\}$. Unfortunately, this may not
always be possible. For example, in the substitution-feasible set $S' = \{c_1 \approx x\}$,
we cannot eliminate the constant $c_1$ since $x$ is the only term congruent to $c_1$
modulo $E_1$, but the resulting set $\{x \mapsto x\}$ is not substitution-feasible.

We say that a (substitution-feasible) set
$S = \{t_i \approx x_i : t_i \in T(\Sigma \cup K, V), x_i \in V, 1 \le i \le n\}$
of rules is $E$-feasible on $V$ if there exist terms $s_i \in T(\Sigma \cup V)$
with $s_i \leftrightarrow^*_E t_i$, such that the set
$S{\uparrow}_E = \{s_i \approx x_i : 1 \le i \le n\}$ is substitution-feasible.

Recall that if $\sigma$ is a rigid $E$-unifier of $s$ and $t$, then there
exists a proof $s \leftrightarrow^*_{E \cup E_\sigma} t$. The transition rules
are obtained by analyzing the above hypothetical proof. The rules attempt to
deduce equations in $E_\sigma$ by simplifying the above proof. We first
consider the special case when $s \leftrightarrow^*_{E_\sigma} t$, and hence
$s$ and $t$ are syntactically unifiable. Trivial proofs can be deleted.



Deletion:

$$\frac{(K, E; V, E? \cup \{t \approx t\}; S)}{(K, E; V, E?; S)}$$



If $E_\sigma$ is a substitution-feasible set and the top function symbols in
$s$ and $t$ are identical, then all replacement steps in the proof
$s \leftrightarrow^*_{E_\sigma} t$ occur inside a non-trivial context, and
hence this proof can be broken up into simpler proofs.



Decomposition:

$$\frac{(K, E; V, E? \cup \{f(t_1, \ldots, t_n) \approx f(s_1, \ldots, s_n)\}; S)}{(K, E; V, E? \cup \{t_1 \approx s_1, \ldots, t_n \approx s_n\}; S)}$$

if $f \in \Sigma$ is a function symbol of arity $n$.

Finally, if the proof $s \leftrightarrow^*_{E_\sigma} t$ is a single
replacement step (at the root position, and within no contexts), we can
eliminate it.



Elimination:

$$\frac{(K, E; V, E? \cup \{x \approx t\}; S)}{(K \cup \{x\}, E \cup E_\theta; V - \{x\}, E?; S \cup E_\theta)}$$

if (i) $\theta = \{x \mapsto t\}$, (ii) the set $E_\theta = \{t \approx x\}$ is
$E$-feasible on $V$, and (iii) $x \in V$.

Deletion and decomposition are identical to the transformation rules for
syntactic unification, cf. [1]. However, elimination (and narrowing and
superposition described below) do not apply the substitution represented by
$E_\theta$ (or $E_\theta{\uparrow}_E$) to the sets $E?$ and $S$ as is done in
the corresponding standard rules for syntactic unification. Instead we add the
equations $E_\theta$ to the second component of the state.
Decomposition, deletion and elimination can be replaced by a single rule that
performs full syntactic unification in one step. We chose to spell out the rules
above as they provide a method to abstractly describe an efficient quadratic-time
syntactic unification algorithm by recursive descent, cf. [1].
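
The three rules above can be sketched as a small recursive-descent unifier. The
following Python fragment is only an illustration under assumptions made here,
not the procedure of this paper: variables are encoded as strings, function
applications as tuples `(symbol, arg1, ..., argn)`, and the names `walk`,
`occurs` and `unify` are hypothetical. In the spirit of elimination above, it
records a binding without applying it to the remaining goals.

```python
def is_var(t):
    # Assumption of this sketch: variables are plain strings.
    return isinstance(t, str)


def walk(t, subst):
    # Follow recorded bindings until an unbound variable or an application.
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def occurs(x, t, subst):
    # Occur check: does variable x appear in t under the current bindings?
    t = walk(t, subst)
    if is_var(t):
        return t == x
    return any(occurs(x, a, subst) for a in t[1:])


def unify(s, t):
    """Return a triangular substitution unifying s and t, or None."""
    subst = {}
    goals = [(s, t)]
    while goals:
        a, b = goals.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue  # Deletion: trivial equation
        if is_var(a):
            if occurs(a, b, subst):
                return None  # occur check fails
            subst[a] = b  # Elimination: record the binding only
        elif is_var(b):
            goals.append((b, a))  # orient variable to the left
        elif a[0] == b[0] and len(a) == len(b):
            goals.extend(zip(a[1:], b[1:]))  # Decomposition
        else:
            return None  # symbol clash
    return subst
```

For instance, `unify(('f', 'x', ('g', 'y')), ('f', ('g', 'y'), 'x'))` binds
`x` to `('g', 'y')` and then discharges the second argument pair by deletion.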



Narrowing and Superposition 

The following rule reflects attempts to identify and eliminate steps in a proof
$s \leftrightarrow^*_{E \cup E_\sigma} t$ that use equations in $E_\sigma$.



Narrowing:

$$\frac{(K, E; V, E? \cup \{s[l'] \approx t\}; S)}{(K \cup V', E \cup E_\theta; V - V', E? \cup \{s[c] \approx t\}; S \cup E_\theta)}$$

where (i) $l \approx c \in E$, (ii) $\theta$ is the most general unifier of $l'$
and $l$, (iii) the set $E_\theta$ is $E$-feasible on $V$, (iv)
$V' = \mathrm{Dom}(\theta) \subseteq V$, (v) $E$ is an abstract congruence
closure with respect to $\Sigma$ and $K \cup V$, and (vi) either
$l' \notin V$ or $l \in V$.

We may also eliminate certain “proof patterns” involving rules in $E_\sigma$
(and $E$) from the proof $s \leftrightarrow^*_{E \cup E_\sigma} t$ via
superposition of rules in $E$.



Superposition:

$$\frac{(K, E = E' \cup \{t \approx c, C[t'] \approx d\}; V, E?; S)}{(K \cup V', E' \cup \{t \approx c\} \cup T; V - V', E?; S \cup E_\theta)}$$

if (i) $\theta$ is the most general unifier of $t$ and $t'$, (ii) $E_\theta$ is
$E$-feasible on $V$, (iii) $T = E_\theta \cup \{C[c] \approx d\}$, (iv)
$V' = \mathrm{Dom}(\theta) \subseteq V$, (v) $E$ is an abstract congruence
closure with respect to $\Sigma$ and $K \cup V$, and (vi) either
$t' \notin V$ or $t \in V$.

Narrowing, elimination and superposition add new equations to the second 
component of the state, which are subsequently processed by C-closure. 

We illustrate the transition process by considering the problem of rigidly
unifying the two terms $fx$ and $gy$ modulo the set
$E_0 = \{gfx \approx z, fgy \approx z\}$. Let $E_1$ denote an abstract
congruence closure $\{x \approx c_1, y \approx c_2, z \approx c_3,
fc_1 \approx c_4, gc_4 \approx c_3, gc_2 \approx c_5, fc_5 \approx c_3\}$ for
$E_0$ and $K_1$ be the set $\{c_1, \ldots, c_5\}$ of constants.



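$i$ | $K_i$            | $E_i$                        | $V_i$         | $E?_i$                | $S_i$                        | Rule
0   | $\emptyset$      | $E_0$                        | $\{x, y, z\}$ | $\{fx \approx gy\}$   | $\emptyset$                  | C-Closure
1   | $K_1$            | $E_1$                        | $\{x, y, z\}$ | $\{fx \approx gy\}$   | $\emptyset$                  | Narrow
2   | $K_1 \cup \{x\}$ | $E_1 \cup \{x \approx c_5\}$ | $\{y, z\}$    | $\{c_3 \approx gy\}$  | $\{c_5 \approx x\}$          | C-Closure
3   | $K_2$            | $E_3$                        | $\{y, z\}$    | $\{c_3 \approx gy\}$  | $\{c_5 \approx x\}$          | Narrow
4   | $K_2 \cup \{y\}$ | $E_3 \cup \{y \approx c_3\}$ | $\{z\}$       | $\{c_3 \approx c_3\}$ | $S_3 \cup \{c_3 \approx y\}$ | Delete
5   | $K_4$            | $E_4$                        | $\{z\}$       | $\emptyset$           | $S_4$                        |
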

where $E_3 = \{x \approx c_1, y \approx c_2, z \approx c_3, fc_1 \approx c_3,
gc_3 \approx c_3, gc_2 \approx c_1, c_5 \approx c_1, c_4 \approx c_3\}$ is an
abstract congruence closure for $E_2$. Since the set $E?_5$ is empty, the
rigid unification process is completed. Any set $S_4{\uparrow}_{E_4}$ is a
rigid unifier of $fx$ and $gy$. For instance, we may choose $gy$ for the
constant $c_5$ and $z$ for $c_3$ to get the set
$S_4{\uparrow}_{E_4} = \{gy \approx x, z \approx y\}$ and the corresponding
unifier $[x \mapsto gy; y \mapsto z]$.
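
The resulting unifier can be checked mechanically: instantiate $E_0$ and both
terms, treat the remaining variables as constants, and decide the ground
equation by congruence closure. The sketch below uses a naive textbook closure
over the subterms involved, not the abstract congruence closure of this paper.
The term encoding (tuples `(symbol, arg1, ..., argn)`, with 1-tuples for
constants and variables) and the names `substitute`, `congruent` and
`is_rigid_unifier` are assumptions of this illustration.

```python
def subterms(t):
    # Yield t and all of its subterms.
    yield t
    for a in t[1:]:
        yield from subterms(a)


def substitute(sigma, t):
    # Apply sigma (variable name -> term) until no bound variable remains.
    # Assumes sigma is acyclic, as any unifier is.
    if len(t) == 1 and t[0] in sigma:
        return substitute(sigma, sigma[t[0]])
    return (t[0],) + tuple(substitute(sigma, a) for a in t[1:])


def congruent(equations, s, t):
    """Decide whether the ground equations entail s = t (naive closure)."""
    terms = set()
    for l, r in list(equations) + [(s, t)]:
        terms.update(subterms(l))
        terms.update(subterms(r))
    parent = {u: u for u in terms}

    def find(u):
        while parent[u] != u:
            u = parent[u]
        return u

    for l, r in equations:
        parent[find(l)] = find(r)
    changed = True
    while changed:  # merge terms whose arguments are pairwise equivalent
        changed = False
        ts = list(terms)
        for i, u in enumerate(ts):
            for v in ts[i + 1:]:
                if (len(u) > 1 and u[0] == v[0] and len(u) == len(v)
                        and find(u) != find(v)
                        and all(find(a) == find(b)
                                for a, b in zip(u[1:], v[1:]))):
                    parent[find(u)] = find(v)
                    changed = True
    return find(s) == find(t)


def is_rigid_unifier(e0, sigma, s, t):
    ground = [(substitute(sigma, l), substitute(sigma, r)) for l, r in e0]
    return congruent(ground, substitute(sigma, s), substitute(sigma, t))


x, y, z = ('x',), ('y',), ('z',)
E0 = [(('g', ('f', x)), z), (('f', ('g', y)), z)]
sigma = {'x': ('g', y), 'y': z}
unifies = is_rigid_unifier(E0, sigma, ('f', x), ('g', y))
identity_unifies = is_rigid_unifier(E0, {}, ('f', x), ('g', y))
```

Under these assumptions `unifies` is true for $[x \mapsto gy; y \mapsto z]$,
while the identity substitution does not unify the two terms.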
Optimizations

A cautious reader might note that the transition rules contain a high degree of
non-determinism in the present form. In particular, after an initial congruence
closure step, every variable $x$ in the third component of a state occurs as a
left-hand side of some rule $x \approx c$ in the second component.
Consequently, this rule can be used for superposition or narrowing with any
rule in the second or fourth component. A partial solution to this problem is
to replace all occurrences of $c$ by $x$ in the second component and then
delete the rule $x \approx c$. This is correct under certain conditions.



Compression1:

$$\frac{(K \cup \{c\}, E \cup \{x \approx c\}; V \cup \{x\}, E?; S)}{(K, E\theta; V \cup \{x\}, E?\theta; S')}$$

if (i) $\theta$ is the substitution $\{c \mapsto x\}$, (ii)
$E \cup \{x \approx c\}$ is a fully-reduced abstract congruence closure (with
respect to $\Sigma$ and $K \cup V$), (iii) $c$ does not occur on the
right-hand side of any rule in $E$, and (iv) $S'$ is obtained from $S$ by
applying substitution $\theta$ only to the left-hand sides of equations in $S$.

A related optimization rule is: 



Compression2:

$$\frac{(K \cup \{c, d\}, E \cup \{c \approx d\}; V, E?; S)}{(K \cup \{d\}, E; V, E?\theta; S')}$$

if (i) $\theta$ is the substitution $\{c \mapsto d\}$, (ii)
$E \cup \{c \approx d\}$ is a fully-reduced abstract congruence closure (with
respect to $\Sigma$ and $K \cup V$), and (iii) $S'$ is obtained from $S$ by
applying substitution $\theta$ only to the left-hand sides of equations in $S$.

These two optimization rules can be integrated into the congruence closure
phase. More specifically, we assume that an application of the C-closure rule
is always followed by an exhaustive application of the above compression
rules. We refer to this combination as an “Opt-Closure” rule.

To illustrate these new rules, note that both $x \approx c_1$ and
$y \approx c_2$ can be eliminated from the abstract congruence closure $E_1$
for $E_0$. We obtain an optimized congruence closure
$\{z \approx c_3, fx \approx c_4, gc_4 \approx c_3, gy \approx c_5,
fc_5 \approx c_3\}$. Note that we cannot remove $z \approx c_3$ from the above
set.
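
The effect of Compression1 on $E_1$ can be reproduced with a few lines of
Python. This is a simplified sketch under assumptions made here: it only
rewrites the second component, does not verify the fully-reduced precondition
(ii), ignores $E?$ and $S$, and uses a hypothetical encoding of rules as
pairs of tuple-encoded terms. The name `compress1` is likewise an assumption.

```python
def replace(term, old, new):
    # Replace every occurrence of the constant `old` in `term` by `new`.
    if term == old:
        return new
    return (term[0],) + tuple(replace(a, old, new) for a in term[1:])


def compress1(rules, variables):
    """Exhaustively remove rules x = c where c is on no other right-hand side."""
    rules = list(rules)
    changed = True
    while changed:
        changed = False
        for i, (lhs, rhs) in enumerate(rules):
            others = rules[:i] + rules[i + 1:]
            # Condition (iii): c must not be the right-hand side of another rule.
            if lhs in variables and all(r != rhs for _, r in others):
                # c cannot occur on a right-hand side, so only left-hand
                # sides need rewriting.
                rules = [(replace(l, rhs, lhs), r) for l, r in others]
                changed = True
                break
    return rules


x, y, z = ('x',), ('y',), ('z',)
c1, c2, c3, c4, c5 = (('c%d' % n,) for n in range(1, 6))
E1 = [(x, c1), (y, c2), (z, c3), (('f', c1), c4),
      (('g', c4), c3), (('g', c2), c5), (('f', c5), c3)]
optimized = compress1(E1, {x, y, z})
```

On this input the rules for `x` and `y` disappear, while the rule for `z`
stays because `c3` also occurs as the right-hand side of other rules.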



4 Correctness 

Let $U$ be an infinite set of constants from which new constants are chosen in
opt-closure. If a state $\xi_i = (K_i, E_i; V_i, E?_i; S_i)$ is transformed to
a state $\xi_j = (K_j, E_j; V_j, E?_j; S_j)$ by opt-closure, then (i) $E_j$ is
an abstract congruence closure (with respect to $\Sigma$ and $K \cup V$) for
$E_i$ and (ii) $E_j$ is contained in a well-founded simplification ordering.*

* For instance, a simple lexicographic path ordering based on a partial
precedence on symbols in $\Sigma \cup U \cup V$ for which $f \succ c$, whenever
$f \in \Sigma$ and $c \in U \cup V$, and $x \succ c$, whenever $x \in V$ and
$c \in U - V$, will suffice.
We use the symbol $\vdash_{REU}$ to denote the one-step transformation relation
induced by opt-closure, deletion, decomposition, elimination, narrowing and
superposition. A derivation is a sequence of states
$\xi_0 \vdash_{REU} \xi_1 \vdash_{REU} \cdots$ with no two consecutive
applications of opt-closure.

Theorem 4 (Termination). All derivations starting from an initial state
$(\emptyset, E_0; V_0, \{s \approx t\}; \emptyset)$ are finite.

Proof. Define a measure associated with a state $(K, E; V, E?; S)$ to be the
pair $(|V|, m_{E?})$, where $|V|$ denotes the cardinality of the set $V$ and
$m_{E?} = \{\{s, t\} : s \approx t \in E?\}$. These pairs are compared
lexicographically using the greater-than relation on the integers in the first
component and the two-fold multiset extension of the ordering $\succ$ in the
second component. This induces a well-founded ordering on states with respect
to which each transition rule is reducing.



Lemma 1. Let $(K_n, E_n; V_n, E?_n; S_n)$ be the final state of a derivation
from $(\emptyset, E_0; V_0, E?_0; \emptyset)$, where $E_0 \cup E?_0$ are
equations over $T(\Sigma, V_0)$. Then
(a) the set $S_n$ is $E_n$-feasible on $V_0$ and
(b) if $E?_n \subseteq$ […] then $E?_0 \subseteq$ […].

Theorem 5 (Soundness). If $(K_n, E_n; V_n, E?_n; S_n)$ is the final state of a
derivation from $(\emptyset, E_0; V_0, E?_0; \emptyset)$, then the set $S_n$ is
$E_n$-feasible and the substitution corresponding to (any) set
$S_n{\uparrow}_{E_n}$ is a rigid $E_0$-unifier of $s$ and $t$.

Proof. The $E_n$-feasibility of $S_n$ on $V_0$ follows from Lemma 1. Since
$E?_n = \emptyset$, the antecedent of the implication in part (b) of Lemma 1 is
vacuously satisfied and hence $E?_0 \subseteq$ […].

Note that by Theorem 1, the ground congruence induced by
$S_n{\uparrow}_{E_n}$ is identical to the congruence induced by $E_\sigma$,
where $\sigma$ is the idempotent substitution corresponding to the set
$S_n{\uparrow}_{E_n}$. Hence, $s \leftrightarrow^*_{E_0 \cup E_\sigma} t$.
Using Theorem 2, we establish that $\sigma$ is a rigid $E_0$-unifier of $s$
and $t$.



Theorem 6 (Completeness). Let $\theta$ be an idempotent rigid $E_0$-unifier of
$s$ and $t$ and $V_0$ the set of variables in $E_0$, $s$ and $t$. Then, there
exists a (finite) derivation with initial state
$(\emptyset, E_0; V_0, \{s \approx t\}; \emptyset)$ and final state
$(K_n, E_n; V_n, \emptyset; S_n)$ where

[…] $\subseteq {\leftrightarrow^*_{E_0 \cup E_\theta}}$.

Proof. (Sketch) Let $\xi_i = (K_i, E_i; V_i, E?_i; S_i)$ be a state. We say a
substitution-feasible set $S^*$ is a solution for state $\xi_i$ if $S^*$ is
$E_i$-feasible on $V_i$ and

$$E?_i \subseteq {\leftrightarrow^*_{E_i \cup S_i \cup S^*}}.$$

Now, given a state $\xi_i$ and a solution $S^*_i$ for $\xi_i$, we show how to
obtain a new state $\xi_j$ and a solution $S^*_j$ for $\xi_j$ such that the
pair $(\xi_j, S^*_j)$ is smaller in a certain well-founded ordering than the
pair $(\xi_i, S^*_i)$ and the congruences induced by
$E_j \cup S_j \cup S^*_j$ and $E_i \cup S_i \cup S^*_i$ are identical. The
well-founded ordering will be a lexicographic combination of the ordering on
states $\xi_i$ used in the proof of Theorem 4 and a well-founded ordering on
substitution-feasible sets $S_i$. If a pair $(\xi_i, S^*_i)$ cannot be reduced
then we show that $E?_i = \emptyset$. This yields the desired conclusion.

The above reduction of a pair $(\xi_i, S^*_i)$ can be achieved in two ways: (i)
by an REU transformation on $\xi_i$, suitably guided by the given solution
$S^*_i$, or (ii) by some simple transformation of the set $S^*_i$. The latter
transformation rules are defined in the context of the state $\xi_i$. The
initial state is $(S^*_i, \emptyset)$.



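R1: $$\frac{(D' \cup \{c \approx x\}, C')}{(D', C' \cup \{c \approx x\})}$$
if $c \in K_i \cup V_i$, $x$ […]

R2: $$\frac{(D' \cup \{c \approx x\}, C')}{[\ldots]}$$
if $c \in K_i \cup V_i$, $x$ […]

R3: $$\frac{(D' \cup \{t[l] \approx x\}, C')}{(D' \cup \{t[c] \approx x\}, C')}$$
if $l \approx c \in E_i$, […]

R4: $$\frac{(D' \cup \{t[l] \approx x\}, C')}{(D' \cup \{t[y] \approx x\}, C')}$$
if $l \approx y \in D'$, […], $l \notin K_i \cup V_i$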



These rules construct a generalized congruence closure for the initial set
$D' \cup C'$ (modulo the congruence induced by $E_i$). If $(D', C')$ can be
obtained from $(S^*_i, \emptyset)$ by repeated application of these rules, then
the set $D' \cup C'$ is (i) substitution-feasible, (ii) $E_i$-feasible with
respect to $V_i$, and (iii) equivalent modulo $E_i$ on $V_i$ to $S^*_i$. The
set of rules R1 to R4 is terminating.

5 Specialization to Syntactic Unification 

Since rigid unification reduces to syntactic unification when the set $E_0$ is
empty, one pertinent question is what procedure the REU transformation rules
yield in this special case. Note that elimination does not apply a substitution
to the fourth and fifth components of the state, but does perform an occur
check in condition (ii). This is in the spirit of the recursive descent
algorithm for syntactic unification, which works on term directed acyclic
graphs and has quadratic time complexity.

In fact, in the case of syntactic unification, every equation in the second
component is of a special form where one side is always a variable. Hence, we
can argue that for each $c \in K$, there is at most one rule in $E$ of the form
$f(\ldots) \approx c$ where $f \in \Sigma$. We may therefore replace
superposition by the following rule:

Det-Decompose:

$$\frac{(K, E; V, E? \cup \{c \approx t\}; S)}{(K, E; V, E? \cup \{f(c_1, \ldots, c_k) \approx t\}; S)}$$

if there exists exactly one rule $f(c_1, \ldots, c_k) \approx c$ with
right-hand side $c$ in $E$.
In addition, we may restrict narrowing so that the unifier $\theta$ is always
the identity substitution, that is, narrowing is used only for simplification
of terms in the fourth component $E?$ by equations in the second component $E$.

We can get various efficient syntactic unification algorithms by using spe- 
cific strategies over our abstract description. Other descriptions of the quadratic 
time syntactic unification algorithms are usually based on descriptions of dags 
and abstract rules that manipulate the dags directly. However, since we can ab- 
stractly capture the notion of sharing, we obtain rules for this class of efficient 
algorithms that work on terms and are very similar to the rules for describing the 
naive syntactic unification procedures (with a worst case exponential behavior) . 

6 Summary 

We have presented a formulation of rigid E-unification in terms of fairly abstract
transformation rules. The main feature is the integration of (abstract) congru- 
ence closure with transformation rules for syntactic unification, paramodulation 
and superposition. The use of an extended signature (inherent in abstract con- 
gruence closure) helps to dispense with term orderings over the original signa- 
ture. An abstract rule-based description facilitates various optimizations. The 
specialization of the transformation rules to syntactic unification yields a set of 
abstract transition rules that describe a class of efficient syntactic unification 
algorithms. Our transformation rules can be derived from proof simplification 
arguments. 

In [10], a congruence closure algorithm is used in a rigid E-unification
procedure, but not as a submodule. Congruence closure is used “indirectly” to
do ground completion. The work on abstract congruence closure shows that
congruence closure actually is ground completion with extension. But for the
purpose of rigid E-unification, we don’t need to translate the abstract closure
to a ground system over the original signature, though we do need to translate
the substitutions back to the original signature. Extended signatures also help
as we do not need to guess an ordering to orient equations such as
$x \approx fa$ when the substitution for $x$ is not yet known. This is a major
concern in [10] where the dependence on orderings complicates the unification
process.

In [5], the problem of correctly orienting equations is solved by delaying 
the choice of orientation and maintaining constraints. Constraint satisfiability is 
required to ensure that orientations are chosen in a consistent manner, and to 
guarantee the termination of such a procedure. 

We would like to point out that the transformation process involves “don’t- 
care” non-determinism (where it does not matter which rule one applies) and 
“don’t-know” non-determinism (where an application of a wrong rule may lead 
to a failure even if a unifier exists) . Whereas opt-closure, deletion and narrowing 
with identity substitution can be applied “don’t-care” non-deterministically, the 
other rules have to be applied in a “don’t-know” non-deterministic manner. 
The rules for syntactic unification described in Section 5 are “don’t-care” non- 
deterministic. 
All algorithms for computing the set of rigid unifiers for a pair of terms can
be seen as a combination of top-down and bottom-up methods. In a pure
bottom-up approach a substitution is guessed non-deterministically: for every
variable one tries every subterm that occurs in the given unification problem,
see [13] for details. Superposition and narrowing using a rule that contains a
variable as its left-hand side capture the bottom-up aspect in our description.
A top-down approach is characterized by the use of narrowing to simplify the
terms in the goal equations $E?$.

We note that for variables that cannot be eliminated from the left-hand sides 
of rules using compression1, we need to try a lot of possible substitutions because
they can unify with almost all subterms in the second and fourth components. 
This is the cause of a bottom-up computation for these variables. For other 
variables, however, we need to try only those substitutions that are produced by 
some unifier during an application of narrowing or superposition, and hence a 
top-down approach works for these variables. 

We illustrate some of the above observations via an example. Let
$E_0 = \{gx \approx x, x \approx a\}$, and suppose we wish to find a rigid
$E_0$-unifier of the terms $gfffgffx$ and $fffx$. The substitution
$\{x \mapsto fa\}$ is a rigid $E_0$-unifier, but it cannot be obtained unless
one unifies the variable $x$ with an appropriate subterm.

We believe that our approach of describing rigid E-unification can be used
to obtain an easier proof of the fact that this problem is in NP. We need to show
that (i) the length of a maximal derivation from any initial state is bounded
by some polynomial in the input size, (ii) each rule can be efficiently applied,
and (iii) there are not too many choices between the rules to get the next step
in a derivation. It is easy to see that (i) holds. For the second part, a crucial
argument involves showing that the test for E-feasibility can be efficiently done.
This is indeed the case, but due to space limitations, we don’t give a way to do
this here.

The notion of an abstract congruence closure is easily extended to handle 
associative and commutative functions [3]. The use of extended signatures is 
particularly useful when one incorporates such theories. This leads us to believe 
that our proposed description of rigid E-unification can be suitably generalized
to such applications. 
Abstract. Model checking and theorem proving have largely comple-
mentary strengths and weaknesses. Thus, a research goal for many years 
has been to find effective and practical ways of combining these ap- 
proaches. However, this goal has been much harder to reach than origi- 
nally anticipated, and several false starts have been reported in the liter- 
ature. In fact, some researchers have gone so far as to question whether 
there even exists an application domain in which such a hybrid solu- 
tion is needed. In this talk I will argue that formal verification of the 
floating-point circuits of modern high-performance microprocessors is 
such a domain. In particular, when a correctness statement linking the 
actual low-level (gate-level) implementation with abstract floating-point 
numbers is needed, a combined model checking and theorem proving 
based approach is essential. To substantiate the claim, I will draw from 
data we have collected during the verification of the floating point units 
of several generations of Intel microprocessors. In addition, I will discuss 
the in-house formal verification environment we have created that has 
enabled this effort with an emphasis on how model checking and theorem 
proving have been integrated without sacrificing usability. 
Abstract. The Parameterized Model Checking Problem (PMCP) is to
determine whether a temporal property is true for every size instance of
a system comprised of many homogeneous processes. Unfortunately, it is
undecidable in general. We are able to establish, nonetheless, decidabil- 
ity of the PMCP in quite a broad framework. We consider asynchronous 
systems comprised of an arbitrary number of homogeneous copies of a 
generic process template. The process template is represented as a syn- 
chronization skeleton while correctness properties are expressed using 
Indexed CTL*\X. We reduce model checking for systems of arbitrary 
size n to model checking for systems of size up to (of) a small cutoff size 
c. This establishes decidability of PMCP as it is only necessary to model 
check a finite number of relatively small systems. Efficient decidability 
can be obtained in some cases. The results generalize to systems com- 
prised of multiple heterogeneous classes of processes, where each class is 
instantiated by many homogeneous copies of the class template (e.g., m
readers and n writers). 



1 Introduction 

Systems with an arbitrary number of homogeneous processes can be used to 
model many important applications. These include classical problems such as 
mutual exclusion, readers and writers, as well as protocols for cache coherence 
and data communication among others. It is often the case that correctness prop- 
erties are expected to hold irrespective of the size of the system, as measured 
by the number of processes in it. However, time and space constraints permit 
us to verify correctness only for instances with a small number of processes. 
This makes it impossible to guarantee correctness in general and thus motivates 
consideration of automated methods to permit verification for system instances 
of arbitrary sizes. The general problem, known in the literature as the Param- 
eterized Model Checking Problem (PMCP) is the following: to decide whether a 
temporal property is true for every size instance of a given system. This problem 
is known to be undecidable in general [1]. However, by imposing certain
stipulations on the organization of the processes we can get a useful
framework with a decidable PMCP.

We establish our results in the synchronization skeleton framework. Our
results apply to systems comprised of multiple heterogeneous classes of
processes with many homogeneous process instances in each class. Thus, given a
family $(U_1, \ldots, U_k)$ of $k$ process classes, and a tuple
$(n_1, \ldots, n_k)$ of natural numbers, we let
$(U_1, \ldots, U_k)^{(n_1, \ldots, n_k)}$ denote the concrete system composed
of $n_1$ copies or instances of $U_1$ through $n_k$ copies or instances of
$U_k$ running in parallel asynchronously (i.e., with interleaving semantics).
By abuse of notation, we also write $(U_1, \ldots, U_k)^{(n_1, \ldots, n_k)}$
for the associated state graph, where each process starts in its designated
initial state.

Correctness properties are expressed using a fragment of Indexed CTL*\X. 
The basic assertions are of the form “for all processes Ah”, or “for all processes
Eh”, where h is an LTL\X formula (built using F “sometimes”, G “always”,
U “until”, but without X “next-time”) over propositions indexed just by the
processes being quantified over, and A “for all futures” and E “for some future” 
are the usual path quantifiers. Use of such an indexed, stuttering-insensitive logic 
is natural for parameterized systems. 

We consider correctness properties of the following types:

1. Over all individual processes of a single class $U_l$:

$\bigwedge_{i_l} Ah(i_l)$ and $\bigwedge_{i_l} Eh(i_l)$, where $i_l$ ranges
over (indices of) individual processes in $U_l$.

2. Over pairs of different processes of a single class $U_l$:

$\bigwedge_{i_l \neq j_l} Ah(i_l, j_l)$ and
$\bigwedge_{i_l \neq j_l} Eh(i_l, j_l)$, where $i_l, j_l$ range over pairs of
distinct processes in $U_l$.

3. Over one process from each of two different classes $U_l$, $U_m$:

$\bigwedge_{i_l, j_m} Ah(i_l, j_m)$ and $\bigwedge_{i_l, j_m} Eh(i_l, j_m)$,
where $i_l$ ranges over $U_l$ and $j_m$ ranges over $U_m$.

We say that the $k$-tuple $(c_1, \ldots, c_k)$ of natural numbers is a cutoff
of $(U_1, \ldots, U_k)$ for formula $f$ iff:
$\forall (n_1, \ldots, n_k) : (U_1, \ldots, U_k)^{(n_1, \ldots, n_k)} \models f$
iff
$\forall (m_1, \ldots, m_k) \preceq (c_1, \ldots, c_k) : (U_1, \ldots, U_k)^{(m_1, \ldots, m_k)} \models f$,
where we write $(m_1, \ldots, m_k) \preceq (c_1, \ldots, c_k)$ to mean
$(m_1, \ldots, m_k)$ is component-wise less than or equal to
$(c_1, \ldots, c_k)$ and $(m_1, \ldots, m_k) \succeq (c_1, \ldots, c_k)$ to mean
$(c_1, \ldots, c_k) \preceq (m_1, \ldots, m_k)$.
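
Read operationally, a cutoff turns the parameterized question into finitely
many ordinary model checking runs, one per size tuple below the cutoff. The
Python sketch below only enumerates those tuples. The names `sizes_up_to` and
`holds_for_all_sizes`, the callback `check_instance` (standing in for a model
checker) and the choice to start each component at 1 are assumptions of this
illustration.

```python
from itertools import product


def sizes_up_to(cutoff, minimum=1):
    """All size tuples that are component-wise between `minimum` and `cutoff`."""
    return list(product(*(range(minimum, c + 1) for c in cutoff)))


def holds_for_all_sizes(cutoff, check_instance):
    # check_instance(sizes) is a hypothetical callback that model checks the
    # concrete system with the given numbers of processes per class.
    return all(check_instance(sizes) for sizes in sizes_up_to(cutoff))


instances = sizes_up_to((2, 3))
```

For the cutoff `(2, 3)` this yields the six instances `(1, 1)` through
`(2, 3)`.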

In this paper, we show that for systems in the synchronization skeleton frame- 
work with transition guards of a particular disjunctive or conjunctive form, there 
is a small cutoff. This, in effect, reduces PMCP to ordinary model checking over
relatively few small, finite-sized systems. In some cases, depending on the kind
of property and guards, we can get an efficient (quadratic in the size of the 
template processes) solution to PMCP. 

Each process class is described by a generic process, a process template for
the class. A system with $k$ classes is given by templates
$(U_1, \ldots, U_k)$. For such a system, define $c_i = |U_i| + 3$ and
$d_i = 2|U_i| + 1$, where $|U_i|$ is the size, i.e. the number of local states,
of template $U_i$. Then, for conjunctive and disjunctive guards, cutoffs of
$(d_1, \ldots, d_k)$ and $(c_1, \ldots, c_k)$ respectively suffice for all
three types of formulae described above. These results give decision
procedures for PMCP for conjunctive or disjunctive guards. Since these guards
provide a broad framework and PMCP is undecidable in general, we view this as
quite a positive result. However,
the decision procedures are not necessarily efficient ones, although they may 
certainly be usable on small examples. Because the cutoff is proportional to the 
sizes of the template processes, the global state graph of the cutoff system is 
of size exponential in the template sizes, and the decision procedures are also 
exponential. In the case of disjunctive guards, if we restrict ourselves to the A
path quantifier, but still permit all three types of properties, then the cutoff can
be reduced, in quadratic time in the size of the template processes, to something 
of the form (1, ..., 2, ..., 1) or (1, ..., 3, ..., 1). In fact, depending on the type of 
property, we can show that it is possible to simplify the guards to ensure that 
only two or three classes need be retained. On the other hand, for conjunctive 
guards, if we restrict ourselves to model checking over infinite paths or over finite 
paths, then sharper cutoffs of the form (1,...,3,...,1), (1,...,2,...,1) or even (1,...,1) 
can, in some cases, be obtained. 
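
As a small worked illustration of the general cutoffs stated above, the
following Python sketch computes $(c_1, \ldots, c_k)$ and $(d_1, \ldots, d_k)$
from the template sizes and counts how many system instances lie below a
cutoff. The function names and the example sizes are assumptions made here, and
instances are counted with at least one process per class.

```python
from math import prod


def disjunctive_cutoff(template_sizes):
    # c_i = |U_i| + 3 for each class i.
    return tuple(n + 3 for n in template_sizes)


def conjunctive_cutoff(template_sizes):
    # d_i = 2|U_i| + 1 for each class i.
    return tuple(2 * n + 1 for n in template_sizes)


def instance_count(cutoff):
    # Number of size tuples with 1 <= m_i <= cutoff_i in every component.
    return prod(cutoff)


sizes = (3, 4)  # hypothetical templates with 3 and 4 local states
c = disjunctive_cutoff(sizes)
d = conjunctive_cutoff(sizes)
```

With these example sizes, `c` is `(6, 7)` and `d` is `(7, 9)`, so 42 and 63
instances respectively would have to be model checked.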

The rest of the paper is organized as follows. Section 2 defines the system 
model. Section 3 describes how to exploit the symmetry inherent in the model 
and correctness properties. Sections 4 and 5 prove the results pertaining to dis- 
junctive and conjunctive guards respectively. We show some applications of our 
results in Section 6. In the concluding Section 7, we discuss related work. 

2 The System Model 

We focus on systems comprised of multiple heterogeneous classes of processes
modelled as synchronization skeletons (cf. [2]). Here, an individual concrete
process has a transition of the form $l \xrightarrow{g} m$ indicating that the
process can transit from local state $l$ to local state $m$, provided the
guard $g$ is true. Each class is specified by giving a generic process
template. If $I$ is (an) index set $\{1, \ldots, n\}$, then we use $U^I$, or
$(U)^{(n)}$ for short, to denote the concurrent system
$U^1 \| \ldots \| U^n$ comprised of the $n$ isomorphic (up to re-indexing)
processes $U^i$ running in parallel asynchronously. For a system with $k$
classes associated with the given templates $U_1, U_2, \ldots, U_k$, we have
corresponding (disjoint) index sets $I_1, I_2, \ldots, I_k$. Each index set
$I_j$ is (a copy of) an interval $\{1, \ldots, n\}$ of natural numbers, denoted
$\{1_j, \ldots, n_j\}$ for emphasis¹. In practice, we assume the $k$ index sets
are specified by giving a $k$-tuple $(n_1, \ldots, n_k)$ of natural numbers,
corresponding to $I_1$ being (a copy of) interval $\{1, \ldots, n_1\}$ through
$I_k$ being (a copy of) interval $\{1, \ldots, n_k\}$.

Given a family $(U_1, \ldots, U_k)$ of $k$ template processes, and a $k$-tuple
$(n_1, \ldots, n_k)$ of natural numbers, we let
$(U_1, \ldots, U_k)^{(n_1, \ldots, n_k)}$ denote the concrete system composed
of $n_1$ copies of $U_1$ through $n_k$ copies of $U_k$ running in parallel
asynchronously (i.e., with interleaving semantics). A template process
$U_l = (S_l, R_l, i_l)$ for class $l$ is comprised of a finite set $S_l$ of
(local) states, a set of transition edges $R_l$, and an initial (local) state
$i_l$. Each transition in $R_l$ is labelled with a guard, a boolean expression
over atomic propositions corresponding to local states of other template
processes. Then given index $i$ and template process $U_l$,
$U_l^i = (S_l^i, R_l^i, i_l^i)$ is used to denote the $i$th copy of the
template process $U_l$. Here $S_l^i$, the state set of $U_l^i$, $R_l^i$ its
transition relation and $i_l^i$ its initial state are obtained from $S_l$,
$R_l$ and $i_l$ respectively by uniformly superscripting the states of $U_l$
with $i$. Thus, for local states $s_l, t_l$ of $S_l$, $s_l^i, t_l^i$ denote
local states of $U_l^i$ and $(s_l, t_l) \in R_l$ iff
$(s_l^i, t_l^i) \in R_l^i$.

¹ E.g., if $I_1$ is a copy of $\{1, 2, 3\}$, the said copy is denoted
$\{1_1, 2_1, 3_1\}$. Informally, subscripted index $3_1$ means process 3 of
class 1; formally, it is the ordered pair $(3, 1)$ as is usual with indexed
logics.

Given guards of transitions in the template process, we now describe how to
get the corresponding guards for the concrete process $U_l^i$ of
$(U_1, \ldots, U_k)^{(n_1, \ldots, n_k)}$. In this paper, we consider the
following two types of guards.

i) Disjunctive guards - of the general form
$(a_1 \vee \ldots \vee b_1) \vee \ldots \vee (a_k \vee \ldots \vee b_k)$,
where the various $a_l, \ldots, b_l$ are (propositions identified with the)
local states of template $U_l$, label each transition $(s_l, t_l) \in R_l$. In
concrete process $U_l^i$ of the system
$(U_1, \ldots, U_k)^{(n_1, \ldots, n_k)}$, the corresponding transition
$(s_l^i, t_l^i) \in R_l^i$ is then labelled by the guard

$$\bigvee_{r \neq i} (a_l^r \vee \ldots \vee b_l^r) \;\vee\; \bigvee_{j \neq l} \Big( \bigvee_{k \in [1..n_j]} (a_j^k \vee \ldots \vee b_j^k) \Big),$$

where proposition $a_j^k$ is understood to be true when process $k$ in class
$U_j$, i.e. $U_j^k$, is in local state $a_j$ for template process $U_j$.

ii) Conjunctive guards with initial state - of the general form
$(i_1 \vee a_1 \vee \ldots \vee b_1) \wedge \ldots \wedge (i_k \vee a_k \vee \ldots \vee b_k)$.
In concrete process $i$ of class $l$, $U_l^i$, in the system
$(U_1, \ldots, U_k)^{(n_1, \ldots, n_k)}$, the corresponding transition is then
labelled by the guard

$$\bigwedge_{r \neq i} (i_l^r \vee a_l^r \vee \ldots \vee b_l^r) \;\wedge\; \bigwedge_{j \neq l} \Big( \bigwedge_{k \in [1..n_j]} (i_j^k \vee a_j^k \vee \ldots \vee b_j^k) \Big).$$



Note that the initial local states of processes must be present in these guards. 
Thus, the initial state of a process has a “neutral” character so that when process
j is in its initial state, it does not prevent progress by another process i. This 
natural condition permits modelling a broad range of applications (and is helpful 
technically). 
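
The two guard forms can be read as simple tests over the local states of all
processes other than the one taking the transition. The Python sketch below is
an illustration under assumptions made here: a global state is a dictionary
from `(class, index)` to a local state name, a template guard is a dictionary
from class to the set of local states it mentions, and the function names and
the example state names are hypothetical.

```python
def others(state, me):
    """Yield (class, local_state) for every process except `me`."""
    for (cls, idx), local in state.items():
        if (cls, idx) != me:
            yield cls, local


def disjunctive_enabled(guard, state, me):
    # True iff some other process is in a local state named by the guard.
    return any(local in guard.get(cls, set())
               for cls, local in others(state, me))


def conjunctive_enabled(guard, initial, state, me):
    # True iff every other process is in its initial state or in a local
    # state named by the guard for its class.
    return all(local == initial[cls] or local in guard.get(cls, set())
               for cls, local in others(state, me))


state = {(1, 1): 'idle', (1, 2): 'crit', (2, 1): 'idle'}
initial = {1: 'idle', 2: 'idle'}
dis_guard = {1: {'crit'}}
con_guard = {1: {'try'}, 2: {'try'}}
```

In this example, process `(1, 1)` can fire a transition with `dis_guard`
because process `(1, 2)` is in `crit`, and process `(1, 2)` can fire one with
`con_guard` because all other processes are still in their initial state.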

We now formalize the asynchronous concurrent (interleaving) semantics. A
process transition with guard $g$ is enabled in global state $s$ iff
$s \models g$, i.e., $g$ is true over the local states in $s$. A transition can
be fired in global state $s$ iff its guard $g$ is enabled. Let
$(U_1, \ldots, U_k)^{(n_1, \ldots, n_k)} = (S^{(n_1, \ldots, n_k)}, R^{(n_1, \ldots, n_k)}, i^{(n_1, \ldots, n_k)})$
be the global state transition graph of the system instance
$(n_1, n_2, \ldots, n_k)$. A state $s \in S^{(n_1, \ldots, n_k)}$ is written as
a $(n_1 + \ldots + n_k)$-tuple
$(u_1^1, \ldots, u_1^{n_1}, u_2^1, \ldots, u_k^{n_k})$ where the projection of
$s$ onto process $i$ of class $l$, denoted $s(l, i)$, equals $u_l^i$, the local
state of the $i$th copy of the template process $U_l$. The initial state is
$i^{(n_1, \ldots, n_k)} = (i_1^1, \ldots, i_k^{n_k})$. A global transition
$(s, t) \in R^{(n_1, \ldots, n_k)}$ iff $t$ results from $s$ by firing an
enabled transition of some process, i.e., there exist $i, l$ such that the
guard labelling $(u_l^i, v_l^i) \in R_l^i$ is enabled at $s$,
$s(l, i) = u_l^i$, $t(l, i) = v_l^i$, and for all $(j, k) \neq (i, l)$,
$s(k, j) = t(k, j)$. We write
$(U_1, \ldots, U_k)^{(n_1, \ldots, n_k)} \models f$ to indicate that the global
state graph of $(U_1, \ldots, U_k)^{(n_1, \ldots, n_k)}$ satisfies $f$ at
initial state $i^{(n_1, \ldots, n_k)}$.

Finally, for global state $s$, define $Set(s) = \{t \mid s$ contains an indexed
local copy of $t\}$. For computation path $x = x_0, x_1, \ldots$ we define
$PathSet(x) = \bigcup_i Set(x_i)$. We say that the sequence of global states
$y = y_0, y_1, \ldots$ is a stuttering of computation path $x$ iff there exists
a parsing $P_0 P_1 \ldots$ of $y$ such that for all $j \geq 0$ there is some
$r > 0$ with $P_j = (x_j)^r$ (cf. [3]). Also, we extend the definition of
projection to include computation sequences as follows: for $i \in [1..n_l]$
the sequence of local states $x_0(l, i), x_1(l, i), \ldots$ is denoted by
$x(l, i)$.
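
These definitions are easy to experiment with on finite path prefixes. The
Python sketch below is an illustration under assumptions made here:
computation paths are infinite in the text, whereas the code handles only
finite lists, a global state is a tuple of local state names, and the function
names are hypothetical. The stuttering test compares maximal runs of equal
states, so that each state of `x` is matched by at least one copy in `y`.

```python
from itertools import groupby


def state_set(s):
    # Set(s): the local states that occur in global state s.
    return set(s)


def path_set(x):
    # PathSet(x): union of Set(x_i) over the states of the (finite) path x.
    result = set()
    for s in x:
        result |= state_set(s)
    return result


def runs(seq):
    # Maximal runs of equal consecutive elements as (value, length) pairs.
    return [(value, len(list(group))) for value, group in groupby(seq)]


def is_stuttering(y, x):
    """True iff y parses into blocks P_j = (x_j)^r with r > 0 for every j."""
    ry, rx = runs(y), runs(x)
    return (len(ry) == len(rx)
            and all(vy == vx and ny >= nx
                    for (vy, ny), (vx, nx) in zip(ry, rx)))


x = [('a', 'a'), ('b', 'a'), ('b', 'c')]
y = [('a', 'a'), ('a', 'a'), ('b', 'a'), ('b', 'c'), ('b', 'c')]
```

Here `y` repeats the first and last global states of `x`, so it is a
stuttering of `x`, and both paths have the same `path_set`.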



3 Appeal to Symmetry 

We can exploit symmetry inherent in the system model and the properties
in the spirit of “state symmetry” codified by [8] (cf. [16], [12]) to simplify
our proof obligation. To establish formulae of types
$\bigwedge_{i_l} f(i_l)$, $\bigwedge_{i_l \neq j_l} f(i_l, j_l)$ and
$\bigwedge_{i_l, j_m} f(i_l, j_m)$, it suffices to show the results with the
formulae replaced by $f(1_l)$, $f(1_l, 2_l)$ and $f(1_l, 1_m)$, respectively.
The basic idea is that in a system comprised of fully interchangeable processes
1 through $n$ of a given class, symmetry considerations dictate that process 1
satisfies a property iff each process $i \in [1..n]$ satisfies the property.
Proofs are omitted for the sake of brevity.



4 Systems with Disjunctive Guards 

In this section, we show how to reduce the PMCP for systems with disjunctive
guards to model checking systems of sizes bounded by a small cutoff, where the
size of the cutoff for each process class is essentially the number of local states of
the individual process template for the class. This yields decidability for this
formulation of PMCP, a pleasant result since PMCP is undecidable in full generality.
But this result, by itself, does not give us an efficient decision procedure for the
PMCP at hand. We go on to show that in the case of universal-path-quantified
specification formulae ($Ah$), efficient decidability can be obtained.



4.1 Properties Ranging over All Processes in a Single Class 

We will first establish the

Theorem 4.1.1 (Disjunctive Cutoff Theorem).

Let $f$ be $\bigwedge_{i_l} Ah(i_l)$ or $\bigwedge_{i_l} Eh(i_l)$, where $h$ is an LTL\X formula and $l \in [1..k]$.
Then we have the following:

$\forall (n_1,\ldots,n_k) \succeq (1,\ldots,1) : (U_1,\ldots,U_k)^{(n_1,\ldots,n_k)} \models f$ iff
$\forall (d_1,\ldots,d_k) \preceq (c_1,\ldots,c_k) : (U_1,\ldots,U_k)^{(d_1,\ldots,d_k)} \models f$,

where the cutoff $(c_1,\ldots,c_k)$ is given by $c_l = |U_l| + 2$, and for $i \neq l$ : $c_i = |U_i| + 1$.



As a corollary, we will have the

Theorem 4.1.2 (Disjunctive Decidability Theorem). PMCP for systems
with disjunctive guards and single-index assertions as above is decidable in
exponential time.

Proof idea

By the Disjunctive Cutoff Theorem, it is enough to model check each of the
exponentially many exponential size state graphs corresponding to systems
$(U_1,\ldots,U_k)^{(d_1,\ldots,d_k)}$ for all $(d_1,\ldots,d_k) \preceq (c_1,\ldots,c_k)$. □
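
To see why the procedure sketched in the proof idea is exponential, it can help to compute the cutoff and enumerate the instances that would have to be model checked. The Python sketch below is a hedged illustration, not the authors' implementation: `disjunctive_cutoff` and `instances_up_to_cutoff` are hypothetical names, `sizes` is assumed to hold the numbers of local states $(|U_1|,\ldots,|U_k|)$, and `l` is the zero-based position of the class the property ranges over.

```python
from itertools import product


def disjunctive_cutoff(sizes, l):
    """Cutoff (c_1, ..., c_k): |U_l| + 2 for the indexed class, |U_i| + 1 otherwise."""
    return tuple(s + 2 if i == l else s + 1 for i, s in enumerate(sizes))


def instances_up_to_cutoff(cutoff):
    """All size vectors (d_1, ..., d_k) with 1 <= d_i <= c_i for every class i."""
    return product(*(range(1, c + 1) for c in cutoff))
```

For two templates with 3 and 2 local states and a property over the first class, `disjunctive_cutoff((3, 2), 0)` gives `(5, 3)` and there are 15 size vectors to check. The number of vectors is the product of the $c_i$ and so grows exponentially with the number of classes $k$, and each instance has a state graph exponential in its size.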

For notational brevity, we establish the above results for systems with just 
two process classes. We begin by proving the following lemmas. 

Lemma 4.1.1 (Disjunctive Monotonicity Lemma).

(i) $\forall n \geq 1 : (V_1,V_2)^{(1,n)} \models Eh(1_2)$ implies $(V_1,V_2)^{(1,n+1)} \models Eh(1_2)$.

(ii) $\forall n \geq 1 : (V_1,V_2)^{(1,n)} \models Eh(1_1)$ implies $(V_1,V_2)^{(1,n+1)} \models Eh(1_1)$.

Proof idea

(i) The idea is that for any computation $x$ of $(V_1,V_2)^{(1,n)}$, there exists an
analogous computation $y$ of $(V_1,V_2)^{(1,n+1)}$ wherein the $(n+1)$st copy of template
process $V_2$ stutters in its initial state and the rest of the processes behave as in $x$.

(ii) This part follows by using a similar argument. □

The following lemma allows reduction in system size, one coordinate at a 
time. 

Lemma 4.1.2 (Disjunctive Bounding Lemma).

(i) $\forall n \geq |V_2| + 2 : (V_1,V_2)^{(1,n)} \models Eh(1_2)$ iff $(V_1,V_2)^{(1,c_2)} \models Eh(1_2)$, where
$c_2 = |V_2| + 2$.

(ii) $\forall n \geq |V_2| + 1 : (V_1,V_2)^{(1,n)} \models Eh(1_1)$ iff $(V_1,V_2)^{(1,|V_2|+1)} \models Eh(1_1)$.

Proof

(i) ($\Rightarrow$) Let $x = x_0, x_1, \ldots$ denote a computation sequence of $(V_1,V_2)^{(1,n)}$.
Define $\mathit{Reach} = \{s_1, \ldots, s_r\}$ to be the set of all local states of template process
$V_2$ occurring in $x$. For $s_t \in \mathit{Reach}$, let $t_1, t_2, \ldots, t_m$ be a finite local computation
of minimal length in $x$ ending in $s_t$. Then we use $\mathit{MinLength}(s_t)$ to denote
$m$ and $\mathit{MinComputation}(s_t)$ to denote the sequence $t_1, t_2, \ldots, t_{m-1}, (t_m)^\omega$. Let
$v = (i_2)^\omega$. If $x$ is an infinite computation sequence and $x(1,1)$ and $x(2,1)$ are
finite local computations, then there exists an infinite local computation sequence
$u$, say. In that case, reset $v = u$.

Construct a formal sequence $y = y_0, y_1, \ldots$ of global states of $(V_1,V_2)^{(1,|V_2|+2)}$
from $x$ as follows.

1. $y(1,1) = x(1,1)$ and $y(2,1) = x(2,1)$, i.e., the local computation paths in $x$
of process index 1 of classes $V_1$, $V_2$ are preserved, and

2. For each state $s_j \in \mathit{Reach}$, we set $y(2,j+1) = \mathit{MinComputation}(s_j)$, i.e.,
we let the $(j+1)$st copy of $V_2$ perform a local computation of minimum length
in $x$ leading to $s_j$ and then let it stutter in $s_j$ forever. The above condition has
the implication that for all $i \geq 1$, $\mathit{Set}(x_i) \subseteq \mathit{Set}(y_i)$. To see this, let $t \in \mathit{Set}(x_i)$.
Then $\mathit{MinLength}(t) \leq i$. Also, $t \in \mathit{Set}(x_i)$ implies that $t \in \mathit{Reach}$, i.e., $t = s_q$
for some $q \in [1..r]$. Then $y(2,q+1)$ stutters in $s_q$ for all $k \geq \mathit{MinLength}(s_q)$
and therefore for all $k \geq i$ also. Hence $y_i(2,q+1)$ is an indexed copy of $s_q$, i.e.,
$t \in \mathit{Set}(y_i)$. Thus for all $i \geq 1$, $\mathit{Set}(x_i) \subseteq \mathit{Set}(y_i)$.

3. $y(2,c_2) = v$. This ensures that if $x$ is an infinite computation sequence
then in $y$ infinitely many local transitions are fired.

However, it might be the case that sequence $y$ violates the interleaving semantics
requirement. Clearly, this happens iff the following scenario occurs. Let states
$s_p, s_q \in \mathit{Reach}$ be such that $\mathit{MinComputation}(s_p)$ and $\mathit{MinComputation}(s_q)$
are realized by the same local computation of $x$ and suppose that $\mathit{MinLength}(s_p)
\leq \mathit{MinLength}(s_q)$. Then, if for $i < \mathit{MinLength}(s_p)$, $(t_i, t_{i+1})$ is a transition in
$\mathit{MinComputation}(s_p)$, $(y_i(2,p+1), y_{i+1}(2,p+1))$ and $(y_i(2,q+1), y_{i+1}(2,q+1))$
are both local transitions driving $y_i$ to $y_{i+1}$. This violates the interleaving
semantics condition requiring that there be at most one local transition driving
each global transition. There are two things to note here. First, for a transition
$(y_i, y_{i+1})$, the violation occurs only for values of $i \leq \max_{j \in [1..r]} \mathit{MinLength}(s_j)$
and secondly, for a fixed $i$, all violations are caused by a unique template
transition $(s,t)$ of $V_2$, namely one which was involved in the transition $(x_i, x_{i+1})$.

To solve this problem, we construct a sequence of states $w = w_0, w_1, \ldots$
from $y$ by “staggering” copies of the same local transition as described below.
Let $(y_i, y_{i+1})$ be a transition where the interleaving semantics requirement
is violated by process indices $in_1, \ldots, in_d$ of $V_2$ executing indexed copies
$(s_2^{in_1}, t_2^{in_1}), \ldots, (s_2^{in_d}, t_2^{in_d})$ respectively of the template transition $(s_2, t_2)$ of $V_2$.
Replace $(y_i, y_{i+1})$ with a sequence $u_1, u_2, \ldots, u_f$ such that $u_1 = y_i$, $u_f = y_{i+1}$ and
for all $j$, transition $(u_j, u_{j+1})$ results by executing local transition $(s_2^{in_j}, t_2^{in_j})$.
Clearly the interleaving semantics requirement is met as at most one local transition
is executed for each global transition. Also, it is not hard to see that for all $j$,
$\mathit{Set}(y_i) \subseteq \mathit{Set}(u_j)$ and hence for all $k$, transition $(u_k, u_{k+1})$ is valid. Finally, note
that states with indices other than $in_1, \ldots, in_d$ are made to stutter finitely often
in $u_1, \ldots, u_f$, which is allowed since we are considering only formulae without the
next-time operator X.

Thus, given a computation path $x$ of $(V_1,V_2)^{(1,n)}$, we have constructed a
stuttering computation path $w$ of $(V_1,V_2)^{(1,c_2)}$ such that the local computation
sequence $w(2,1)$ is a stuttering of the local computation sequence $x(2,1)$. From
this path correspondence, we easily have the result.

($\Leftarrow$) The proof follows by repeated application of the Disjunctive Monotonicity
Lemma.

(ii) This part follows by using a similar argument.

□
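
The “staggering” step of the proof, in which several copies that fire the same template transition in one formal step are made to fire one after another, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: a global state is a tuple of local-state names for the copies of $V_2$, `movers` lists the zero-based indices $in_1, \ldots, in_d$, and `target` is the common target state $t_2$. The function name `stagger` is hypothetical.

```python
def stagger(state, movers, target):
    """Replace one simultaneous step by a sequence of single-process steps.

    Returns u_1, ..., u_f with u_1 = state and u_f = the state in which every
    copy listed in `movers` has moved to `target`; consecutive states differ
    in exactly one copy, as interleaving semantics requires.
    """
    sequence = [tuple(state)]
    current = list(state)
    for index in movers:
        current[index] = target
        sequence.append(tuple(current))
    return sequence
```

For example, `stagger(("s", "s", "q"), [0, 1], "t")` yields the three states `("s", "s", "q")`, `("t", "s", "q")`, `("t", "t", "q")`; copies not listed in `movers` simply stutter, which is harmless for formulae without the next-time operator.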
The following lemma allows reduction in system size over multiple coordinates
simultaneously (2 coordinates for notational brevity).

Lemma 4.1.3 (Disjunctive Truncation Lemma).

$\forall n_1, n_2 \geq 1 : (U_1,U_2)^{(n_1,n_2)} \models Eh(1_2)$ iff $(U_1,U_2)^{(n_1',n_2')} \models Eh(1_2)$, where
$n_2' = \min(n_2, |U_2| + 2)$ and $n_1' = \min(n_1, |U_1| + 1)$.

Proof

If $n_2 > |U_2| + 2$, set $V_1 = U_1^{n_1}$ and $V_2 = U_2$. Then $(U_1,U_2)^{(n_1,n_2)} \models Eh(1_2)$ iff
$(V_1,V_2)^{(1,n_2)} \models Eh(1_2)$ iff $(V_1,V_2)^{(1,n_2')} \models Eh(1_2)$ (by the Disjunctive Bounding
Lemma) iff $(U_1,U_2)^{(n_1,n_2')} \models Eh(1_2)$.

If $n_1 \leq |U_1| + 1$, then $n_1 = n_1'$ and we are done, else set $V_1 = U_2^{n_2'}$ and $V_2 = U_1$.
Then $(U_1,U_2)^{(n_1,n_2')} \models Eh(1_2)$ iff $(U_2,U_1)^{(n_2',n_1)} \models Eh(1_1)$ iff $(V_1,V_2)^{(1,n_1)} \models
Eh(1_1)$ iff $(V_1,V_2)^{(1,n_1')} \models Eh(1_1)$ (by the Disjunctive Bounding Lemma) iff
$(U_1,U_2)^{(n_1',n_2')} \models Eh(1_2)$. □

An easy but important consequence of the Disjunctive Truncation Lemma is 
the following 

Theorem 4.1.3 (Disjunctive Cutoff Result). 

Let $f$ be $\bigwedge_{i_l} Ah(i_l)$ or $\bigwedge_{i_l} Eh(i_l)$, where $h$ is an LTL\X formula and $l \in [1..2]$.
Then we have the following:

$\forall (n_1,n_2) \succeq (1,1) : (U_1,U_2)^{(n_1,n_2)} \models f$ iff
$\forall (d_1,d_2) \preceq (c_1,c_2) : (U_1,U_2)^{(d_1,d_2)} \models f$,

where the cutoff $(c_1,c_2)$ is given by $c_l = |U_l| + 2$, and for $i \neq l$ : $c_i = |U_i| + 1$.

Proof

By appeal to symmetry and the fact that A and E are duals, it suffices to prove
the result for formulae of the type $Eh(1_2)$. The ($\Rightarrow$) direction is trivial. For the
($\Leftarrow$) direction, let $n_1, n_2 \geq 1$. Define $n_1' = \min(n_1, |U_1| + 1)$, $n_2' = \min(n_2, |U_2| + 2)$.
Then $(U_1,U_2)^{(n_1,n_2)} \models f(1_2)$ iff $(U_1,U_2)^{(n_1',n_2')} \models f(1_2)$ by the Disjunctive
Truncation Lemma. The latter is true since $(n_1',n_2') \preceq (c_1,c_2)$. This proves the
cutoff result. □

The earlier-stated Cutoff Theorem re-articulates the above Cutoff Result
more generally for systems with $k \geq 1$ different classes of processes; since its
proof is along similar lines but is notationally more complex, we omit it for the
sake of brevity.



4.2 Efficient Decidability for “For All Future” Properties 

It can be shown that for “for some future” properties, corresponding to formulae
of the type $\bigwedge_{i_l} Eh(i_l)$, the reduction entailed in the previous result is, in general, the
best possible. We omit the proof for the sake of brevity.

However, for universal-path-quantified properties, it is possible to be much
more efficient. We will establish the

Theorem 4.2.1 (Reduction Theorem). Define $V = U_r'$ if for some $r \in [1..k]$,
the transition graph for $U_r'$ has a nontrivial strongly connected component, else
set $V = U_1'$. Then $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)} \models \bigwedge_{i_l} Ah(i_l)$ iff $(U_l', V)^{(1,1)} \models Ah(1_1)$,
where $c_l = |U_l| + 2$, $c_i = |U_i| + 1$ for $i \neq l$ and $U_i'$ is the simplified process that
we get from $U_i$ by the reduction technique described below.

This makes precise our claim that for formulae of the type $\bigwedge_{i_l} Ah(i_l)$, it is
possible to give efficient decision procedures for the PMCP at hand, by reducing
it to model checking systems consisting of two or three template processes.

To this end, we first prove the following lemma which states that the PMCP 
problem for the above mentioned properties reduces to model checking just the 
single system instance of size equal to the (small) cutoff (as opposed to all sys- 
tems of size less than or equal to the cutoff). 

Lemma 4.2.1 (Single-Cutoff Lemma).

$\forall n_1, n_2 \geq 1 : (U_1,U_2)^{(n_1,n_2)} \models Ah(1_2)$ iff $(U_1,U_2)^{(c_1,c_2)} \models Ah(1_2)$, where
$c_1 = |U_1| + 1$ and $c_2 = |U_2| + 2$.

Proof

($\Rightarrow$) This direction follows easily by instantiating $n_1 = c_1$ and $n_2 = c_2$ on
the left hand side.

($\Leftarrow$) Choose arbitrary $k_1, k_2 \geq 1$. Set $k_1' = \min(k_1, c_1)$ and $k_2' = \min(k_2, c_2)$.
Then $(U_1,U_2)^{(k_1,k_2)} \models Eh(1_2)$ iff $(U_1,U_2)^{(k_1',k_2')} \models Eh(1_2)$ (by the Disjunctive
Truncation Lemma), which implies $(U_1,U_2)^{(c_1,c_2)} \models Eh(1_2)$ (by repeated
application of the Disjunctive Monotonicity Lemma). Now, by contraposition,
$(U_1,U_2)^{(c_1,c_2)} \models Ah(1_2)$ implies $(U_1,U_2)^{(k_1,k_2)} \models Ah(1_2)$. Since $k_1, k_2$ were
arbitrarily chosen, the proof is complete. □

Next, we transform the given template processes and follow that up with lemmas
giving the soundness and completeness proofs for the transformation. Given
template processes $U_1, \ldots, U_k$, define $\mathit{ReachableStates}(U_1, \ldots, U_k) = (S_1', \ldots, S_k')$,
where $S_l' = \{t \mid t \in S_l$, such that for some $n_1, n_2, \ldots, n_k \geq 1$, there exists a
computation path of $(U_1,\ldots,U_k)^{(n_1,\ldots,n_k)}$ leading to a global state that contains a
local indexed copy of $t\}$. For $j \geq 0$, $l \in [1..k]$, we define $P_l^j$ as follows:

$P_l^0 = \{i_l\}$

$P_l^{j+1} = P_l^j \cup \{p' : \exists p \in P_l^j : p \xrightarrow{g} p' \in R_l$ and expression $g$ contains a
state in $\bigcup_t P_t^j\}$.

For $l \in [1..k]$, define $P_l = \bigcup_j P_l^j$. Then we have the
Lemma 4.2.2 (Soundness Lemma). Given $j$, for all $l \in [1..k]$ define $a_l = |P_l^j|$.
Then, there exists a finite computation sequence $x = x_0, x_1, \ldots, x_m$ of
$(U_1,\ldots,U_k)^{(a_1,\ldots,a_k)}$ such that $\forall l \in [1..k] : \forall s_l \in P_l^j : (\exists p \in [1..a_l] : x_m(l,p) = s_l^p)$.

Proof

The proof is by induction on $j$. The base case, $j = 0$, is vacuously true. Assume
that the result holds for $j \leq u$ and let $y = y_0, y_1, \ldots, y_t$ be a computation sequence
of $(U_1,\ldots,U_k)^{(r_1,\ldots,r_k)}$, where $r_l = |P_l^u|$, with the property that $\forall l \in [1..k] : \forall s_l \in
P_l^u : (\exists p \in [1..r_l] : y_t(l,p) = s_l^p)$.

Now, assume that $P_l^{u+1} \neq P_l^u$, and let $s_l \in P_l^{u+1} \setminus P_l^u$. Furthermore, let
$(s_l', s_l)$ be the transition that led to the inclusion of $s_l$ into $P_l^{u+1}$. Clearly, $s_l' \in P_l^u$.
Then, by the induction hypothesis, $\exists q \in [1..r_l] : y_t(l,q)$ is an indexed copy of $s_l'$.
Consider the sequence $y' = y_0', y_1', \ldots, y_{2t+1}'$ of states of $(U_1,\ldots,U_k)^{(r_1,\ldots,r_l+1,\ldots,r_k)}$,
where for $i \in [1..k]$, $c \in [1..r_i]$ : $y'(i,c) = y(i,c)(y_t(i,c))^{t+1}$ and $y'(l,r_l+1) =
(i_l)^t z$, where $z$ is $y(l,q)\,s_l^{r_l+1}$ with the index $q$ replaced by $r_l+1$. It can be seen
that $y'$ is a valid stuttering computation path of $(U_1,\ldots,U_k)^{(r_1,\ldots,r_l+1,\ldots,r_k)}$, where
$y_{2t+1}'$ has the property that $\forall l \in [1..k] : \forall s_l \in P_l^u : \exists p \in [1..r_l] : y_{2t+1}'(l,p) = s_l^p$
and $y_{2t+1}'(l,r_l+1) = s_l^{r_l+1}$. Repeating the above procedure for all states in
$P_l^{u+1} \setminus P_l^u$, we get a computation path with the desired property. This completes
the induction step and proves the lemma. □

Lemma 4.2.3 (Completeness Lemma). $(S_1', \ldots, S_k') = (P_1, \ldots, P_k)$.

Proof

By the above lemma, $\forall i \in [1..k] : P_i \subseteq S_i'$. If possible, suppose that $(S_1', \ldots, S_k') \neq
(P_1, \ldots, P_k)$. Then the set $D = \bigcup_i (S_i' - P_i) \neq \emptyset$. For definiteness, let $s_l \in
D \cap S_l'$. Then by definition of $S_l'$, there exists a finite computation sequence
$x = x_0, x_1, \ldots, x_m$ such that for some $i$, $x_m(l,i) = s_l^i$. Let $j \in [0..m]$ be the
smallest index such that $\mathit{Set}(x_j) \cap D \neq \emptyset$. Then $\mathit{PathSet}(x_0, \ldots, x_{j-1}) \subseteq \bigcup_i P_i$,
which implies that there exists a transition $(s_l', s_l)$ in $R_l$ with guard $g$ such that
$x_{j-1} \models g$. But this implies that for some $t$, $s_l$ would be included in $P_l^t$, i.e., $s_l \in P_l$,
a contradiction to our assumption that $s_l \in D$. Thus $D = \emptyset$ and we are done. □

We now modify the $k$-tuple of template processes $(U_1, \ldots, U_k)$ to get the
$k$-tuple $(U_1', \ldots, U_k')$, where $U_i' = (S_i', R_i', i_i)$, with $(s_i, t_i) \in R_i'$ iff guard $g_i$ labelling
$(s_i, t_i)$ in $U_i$ contains an indexed copy of a state in $\bigcup_{t \in [1..k]} S_t'$. Furthermore, any
transition in the new system is labelled with $g_U$, a universal guard that evaluates
to true irrespective of the current global state of the system. The motivation
behind these definitions is that since for any $n_1, n_2, \ldots, n_k \geq 1$, no indexed copy
of states in $S_i \setminus S_i'$ is reachable in any computation of $(U_1,\ldots,U_k)^{(n_1,\ldots,n_k)}$, we
can safely delete these states from their respective template process. Also, any
guard of a template process involving only states in $S_i \setminus S_i'$ will then always
evaluate to false and hence the transition labelled by this guard will never be
fired. This justifies deleting such transitions from the transition graph of the
respective template processes. This brings us to the following Reduction Result, which
by appeal to symmetry yields the Reduction Theorem stated before.
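
The fixpoint $P_l = \bigcup_j P_l^j$ and the simplified templates can be computed by a straightforward iteration. The Python sketch below is one possible rendering, under assumptions that are not part of the paper: each template is a dictionary with an `"init"` state and a list of `(source, guard, target)` transitions, local-state names are unique across classes, and a disjunctive guard is represented as the set of local states whose presence in some other process enables the transition. The names `reachable_local_states` and `simplify` are hypothetical.

```python
def reachable_local_states(templates):
    """Iterate the P_l^j construction to its limit P_l, one set per class."""
    reach = [{t["init"]} for t in templates]
    changed = True
    while changed:
        changed = False
        # States known reachable in any class at the start of this round.
        present = set().union(*reach)
        for l, template in enumerate(templates):
            for src, guard, dst in template["transitions"]:
                # A transition contributes once its source is reachable and
                # its guard mentions some already reachable local state.
                if src in reach[l] and dst not in reach[l] and guard & present:
                    reach[l].add(dst)
                    changed = True
    return reach


def simplify(templates):
    """Drop unreachable states and never-enabled transitions; guards become universal."""
    reach = reachable_local_states(templates)
    present = set().union(*reach)
    simplified = []
    for l, template in enumerate(templates):
        kept = [(src, dst) for src, guard, dst in template["transitions"]
                if src in reach[l] and guard & present]
        simplified.append({"init": template["init"],
                           "states": reach[l],
                           "transitions": kept})
    return simplified
```

Each round either adds a state or terminates, so the loop runs at most once per local state plus one final pass, and every round makes a single pass over all transitions.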

Theorem 4.2.2 (Reduction Result). Define $V = U_r'$ if for some $r \in [1..k]$,
the transition graph for $U_r'$ has a nontrivial strongly connected component, else
set $V = U_1'$. Then $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)} \models Ah(1_p)$ iff $(U_p', V)^{(1,1)} \models Ah(1_1)$, where
$c_p = |U_p| + 2$ and $c_i = |U_i| + 1$ for $i \neq p$.

Proof

We show that $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)} \models Eh(1_p)$ iff $(U_p', V)^{(1,1)} \models Eh(1_1)$. For
definiteness, let $V = U_r'$.

($\Rightarrow$) Define sequence $u = (i_r)^\omega$. If $U_r'$ has a nontrivial strongly connected
component, then there exists an infinite path $v$, say, in its transition graph. In
that case, reset $u = v$.

Let $x = x_1, x_2, \ldots$ be a computation sequence of $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)}$. Define a
formal sequence $y = y_1, y_2, \ldots$ as follows. Set $y(1,1) = x(p,1)$ and in case $x(p,1)$
is a finite computation sequence of length $f$, say, set $y(2,1) = (i_r)^f u$, else set
$y(2,1) = (i_r)^\omega$. To prove that $y$ is a valid computation sequence of $(U_p', V)^{(1,1)}$,
it suffices to show that all transitions of local path $y(1,1)$ are valid. This follows
from the definition of $R_p'$ and by noting that all states occurring in $x(p,1)$ are
reachable and all transitions in $x(p,1)$ are labelled by guards whose expressions
involve a state in $\bigcup_j S_j'$ and hence they occur in $R_p'$.

($\Leftarrow$) By the Soundness and Completeness lemmas, it follows that there exists
a finite computation path $u = u_0, u_1, \ldots, u_m$ of $(U_1,\ldots,U_k)^{(|U_1|,\ldots,|U_k|)}$, starting
at the initial state, such that $\forall j \in [1..k] : \forall q_j \in S_j' : \exists t \in [1..|U_j|] : u_m(j,t) =
q_j^t$. Let $x = x_0, x_1, \ldots$ be a computation path of $(U_p', V)^{(1,1)}$. Define a formal
sequence $y = y_0, y_1, \ldots$ of states of $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)}$ as follows. Set $y(p,1) =
((i_p)^m)x(1,1)$, $y(r,1) = ((i_r)^m)x(2,1)$, $\forall z \in [1..|U_p|] : y(p,z+1) = u(p,z)$,
$\forall z \in [1..|U_r|] : y(r,z+1) = u(r,z)$, and $\forall j \in [1..k], j \neq p,r : \forall z \in [1..|U_j|] :
y(j,z) = u(j,z)(u_m(j,z))^\omega$. Note that $\forall i \geq m : \mathit{Set}(y_i) = \bigcup_j S_j'$ and hence for
all $i \geq m$, all template transitions in $R_p' \cup R_r'$ are enabled in $y_i$. Thus for all $i \geq m$,
all transitions $(y_i, y_{i+1})$ are valid and hence it follows that $y$ is a stuttering of
a valid computation path of $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)}$ with local path $y(p,1)$ being a
stuttering of local path $x(1,1)$. The path correspondence gives us the result. □



Finally, we get the 

Theorem 4.2.3 (Efficient Decidability Theorem). For systems with disjunctive
guards and properties of the type $\bigwedge_{i_l} Ah(i_l)$, the PMCP is decidable in
time quadratic in the size of the given family $(U_1, \ldots, U_k)$, where size is defined
as $\sum_j (|S_j| + |R_j|)$, and linear in the size of the Büchi Automaton for $\neg h(1_l)$.

Proof We first argue that we can construct the simplified system $U_l'$ efficiently.
By definition, $\forall j \geq 0 : P_l^j \subseteq P_l^{j+1}$. Let $P^j = \bigcup_l P_l^j$. Then, it is easy to see that
$\forall j \geq 0 : P^j \subseteq P^{j+1}$ and if $P^j = P^{j+1}$, then $\forall i \geq j : P^i = P^j$. Also, $\forall i : P^i \subseteq
\bigcup_l S_l$. Thus to evaluate sets $P_l^j$, for all $j$, it suffices to evaluate them for values of
$j \leq \sum_l |S_l|$. Furthermore, given $P_l^j$, to evaluate $P_l^{j+1}$ it suffices to make a pass
through all transitions leading to states in $S_l \setminus P_l^j$ to check if a guard leading to
any of these states contains a state in $\bigcup_t P_t^j$. This can clearly be accomplished in
time $O(\sum_j (|S_j| + |R_j|))$. The above remarks imply that evaluation of sets $P_l^j$ can
be done in time $O((\sum_j (|S_j| + |R_j|))^2)$. Furthermore, given $p$, whether $U_p'$ has a
nontrivial strongly connected component can be decided in time $O(|S_p'| + |R_p'|)$
by constructing all strongly connected components of $U_p'$. Thus, determining
whether such a $p$ exists can be done in time $O(\sum_j (|S_j| + |R_j|))$.

The Reduction Theorem reduces the PMCP problem to model checking for
the system $(U_l', V)^{(1,1)}$, where $V = U_r'$ if for some $r \in [1..k]$, the transition
graph for $U_r'$ has a nontrivial strongly connected component, else $V = U_1'$. Now,
$(U_l', V)^{(1,1)} \models Ah(1_1)$ iff $(U_l', V)^{(1,1)} \not\models E\neg h(1_1)$. Thus it suffices to check
whether $(U_l', V)^{(1,1)} \models E\neg h(1_1)$, for which we use the automata-theoretic
approach of [20]. We construct a Büchi Automaton $B_{\neg h}$ for $\neg h(1_1)$, and check that the
language of the product Büchi Automaton $\mathcal{P}$ of $(U_l', V)^{(1,1)}$ and $B_{\neg h}$ is
nonempty (cf. [14]). Since the nonemptiness check for $\mathcal{P}$ can be done in time linear in
the size of $\mathcal{P}$, and the size of $(U_l', V)^{(1,1)}$ is $O((\sum_j (|S_j| + |R_j|))^2)$, we are done.

□
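
The test for a nontrivial strongly connected component used in the proof amounts to asking whether a template's transition graph contains a cycle, which is what guarantees an infinite local path. The proof obtains this by constructing all strongly connected components; the Python sketch below instead uses a plain depth-first search with three colours, which answers the same yes/no question in time linear in the number of states and transitions. The function name `has_cycle` and the edge-list representation are assumptions made for illustration, and a self-loop is counted as a cycle.

```python
def has_cycle(states, transitions):
    """True iff the directed graph (states, transitions) contains a cycle."""
    successors = {s: [] for s in states}
    for src, dst in transitions:
        successors[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on DFS stack, finished
    colour = {s: WHITE for s in states}

    for root in states:
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(successors[root]))]
        while stack:
            node, remaining = stack[-1]
            for nxt in remaining:
                if colour[nxt] == GREY:   # back edge closes a cycle
                    return True
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(successors[nxt])))
                    break
            else:
                colour[node] = BLACK      # all successors explored
                stack.pop()
    return False
```

The explicit stack avoids recursion limits on large templates; each state is pushed once and each transition is examined once.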



4.3 Properties Ranging over Pairs of Processes from Two Classes 

Using similar kinds of arguments as were used in proving assertions in
Sections 4.1 and 4.2, we can prove the following results.

Theorem 4.3.1 (Cutoff Theorem).

Let $f$ be $\bigwedge_{i_l, j_m} Ah(i_l, j_m)$ or $\bigwedge_{i_l, j_m} Eh(i_l, j_m)$, where $h$ is an LTL\X formula and
$l, m \in [1..k]$. Then we have the following:

$\forall (n_1,\ldots,n_k) \succeq (1,\ldots,1) : (U_1,\ldots,U_k)^{(n_1,\ldots,n_k)} \models f$ iff
$\forall (d_1,\ldots,d_k) \preceq (c_1,\ldots,c_k) : (U_1,\ldots,U_k)^{(d_1,\ldots,d_k)} \models f$,

where the cutoff $(c_1,\ldots,c_k)$ is given by $c_l = |U_l| + 2$, $c_m = |U_m| + 2$ and for
$i \neq l, m$ : $c_i = |U_i| + 1$.

Theorem 4.3.2 (Reduction Theorem).

$(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)} \models \bigwedge_{i_l, j_m} Ah(i_l, j_m)$ iff $(\ldots) \models \bigwedge_{i_l, j_m} Ah(i_l, j_m)$,

where $c_l = |U_l| + 2$, $c_m = |U_m| + 2$ and $\forall i \neq l, m : c_i = |U_i| + 1$.

Again, we get the analogous Decidability Theorem and Efficient Decidability
Theorem. Moreover, we can specialize these results to apply when $l = m$. This
permits reasoning about formulae of the type $\bigwedge_{i_l \neq j_l} Ah(i_l, j_l)$ or $\bigwedge_{i_l \neq j_l} Eh(i_l, j_l)$
for properties ranging over all pairs of processes in a single class $l$.
5 Systems with Conjunctive Guards

The development of results for conjunctive guards closely resembles that for dis- 
junctive guards. Hence, for the sake of brevity, we only provide a proof sketch 
for each of the results. 

Lemma 5.1 (Conjunctive Monotonicity Lemma).

(i) $\forall n \geq 1 : (V_1,V_2)^{(1,n)} \models Eh(1_2)$ implies $(V_1,V_2)^{(1,n+1)} \models Eh(1_2)$.

(ii) $\forall n \geq 1 : (V_1,V_2)^{(1,n)} \models Eh(1_1)$ implies $(V_1,V_2)^{(1,n+1)} \models Eh(1_1)$.

Proof Sketch The intuition behind this lemma is that for any computation $x$ of
$(V_1,V_2)^{(1,n)}$, there exists an analogous computation $y$ of $(V_1,V_2)^{(1,n+1)}$ wherein
the $(n+1)$st copy of template process $V_2$ stutters in its initial state and the rest
of the processes behave as in $x$. □

Lemma 5.2 (Conjunctive Bounding Lemma).

(i) $\forall n \geq 2|V_2| + 1 : (V_1,V_2)^{(1,n)} \models Eh(1_2)$ iff $(V_1,V_2)^{(1,c_2)} \models Eh(1_2)$, where
$c_2 = 2|V_2| + 1$.

(ii) $\forall n \geq 2|V_2| : (V_1,V_2)^{(1,n)} \models Eh(1_1)$ iff $(V_1,V_2)^{(1,2|V_2|)} \models Eh(1_1)$.

Proof Sketch

Let $x$ be an infinite computation of $(V_1,V_2)^{(1,n)}$. Set $v = (i_2)^\omega$, where $i_2$ is the
initial state of $V_2$. If none of $x(1,1)$ or $x(2,1)$ is an infinite local computation then
there exists $l \neq 1$ such that $x(2,l)$ is an infinite local computation. In that case,
reset $v = x(2,l)$. Construct a formal sequence $y$ of $(V_1,V_2)^{(1,c_2)}$ as follows. Set
$y(1,1) = x(1,1)$, $y(2,1) = x(2,1)$, $y(2,2) = v$ and $\forall j \in [3..c_2] : y(2,j) = (i_2)^\omega$.
Then, it can be proved that $y$ is a stuttering of a valid infinite computation of
$(V_1,V_2)^{(1,c_2)}$.

Now consider the case when $x = x_0 x_1 \ldots x_d$ is a deadlocked computation
sequence of $(V_1,V_2)^{(1,n)}$. Let $S = \mathit{Set}(x_d) \cap S_2$. For each $s \in S$, define an index
set $I_s$ as follows. If there exists a unique indexed copy $x_d(2,in)$ of $s$ in $x_d$, set
$I_s = \{in\}$, else set $I_s = \{in_1, in_2\}$, where $x_d(2,in_1)$ and $x_d(2,in_2)$ are indexed
copies of $s$ and $in_1 \neq in_2$. Let $I = \bigcup_s I_s$. Also, for index $j$ and global state $s$
define $\mathit{Set}(s,j) = \{t \mid t \in S_1 \cup S_2$ and $t$ has a copy with index other than $j$ in $s\}$.
Construct a formal sequence $y = y_0, \ldots, y_d$ of states of $(V_1,V_2)^{(1,2|V_2|+1)}$ by
projecting each global state $x_i$ onto process 1 coordinate of $V_1$ and process index
coordinate of $V_2$, where index = 1 or index $\in I$. From our construction, it follows
that for all $j$, $\mathit{Set}(y_j) \subseteq \mathit{Set}(x_j)$. Hence all transitions $(y_j, y_{j+1})$ are valid. Also
for each $i \in [1..(2|V_2| + 1)]$, there exists $j \in [1..n]$ such that $y(2,i)$ is a projection
of $x(2,j)$. Then, from our construction, it follows that $\mathit{Set}(x_d,j) = \mathit{Set}(y_d,i)$
and thus process $V_2^i$ is deadlocked in $y_d$ iff $V_2^j$ is deadlocked in $x_d$. Then from
the fact that $x_d$ is deadlocked, we can conclude that $y_d$ is a deadlocked state and
hence $y$ is a stuttering of a deadlocked computation of $(V_1,V_2)^{(1,2|V_2|+1)}$.

In both cases, when constructing $y$ from $x$, we preserved the local computation
sequence of process $V_2^1$. This path correspondence gives us the result. □
Again as before, the following lemma allows reduction in system size over
multiple coordinates simultaneously (2 coordinates for notational brevity).

Lemma 5.3 (Conjunctive Truncation Lemma).

$\forall n_1, n_2 \geq 1 : (U_1,U_2)^{(n_1,n_2)} \models Eh(1_2)$ iff $(U_1,U_2)^{(n_1',n_2')} \models Eh(1_2)$,
where $n_2' = \min(n_2, 2|U_2| + 1)$ and $n_1' = \min(n_1, 2|U_1|)$.

Proof Idea

Use the Conjunctive Bounding Lemma and associativity of the $\|$ operator. □

Theorem 5.1 (Conjunctive Cutoff Result).

Let $f$ be $\bigwedge_{i_l} Ah(i_l)$ or $\bigwedge_{i_l} Eh(i_l)$, where $h$ is an LTL\X formula and $l \in [1..2]$.

Then we have the following:

$\forall (n_1,n_2) \succeq (1,1) : (U_1,U_2)^{(n_1,n_2)} \models f$ iff
$\forall (d_1,d_2) \preceq (c_1,c_2) : (U_1,U_2)^{(d_1,d_2)} \models f$,

where the cutoff $(c_1,c_2)$ is given by $c_l = 2|U_l| + 1$, and for $i \neq l$ : $c_i = 2|U_i|$.

Proof Sketch Follows easily from the Truncation Lemma. □

More generally, for systems with $k \geq 1$ classes of processes we have

Theorem 5.2 (Conjunctive Cutoff Theorem).

Let $f$ be $\bigwedge_{i_l} Ah(i_l)$ or $\bigwedge_{i_l} Eh(i_l)$, where $h$ is an LTL\X formula and $l \in [1..k]$.

Then we have the following:

$\forall (n_1,\ldots,n_k) \succeq (1,\ldots,1) : (U_1,\ldots,U_k)^{(n_1,\ldots,n_k)} \models f$ iff
$\forall (d_1,\ldots,d_k) \preceq (c_1,\ldots,c_k) : (U_1,\ldots,U_k)^{(d_1,\ldots,d_k)} \models f$,

where the cutoff $(c_1,\ldots,c_k)$ is given by $c_l = 2|U_l| + 1$, and for $i \neq l$ : $c_i = 2|U_i|$.
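
The conjunctive cutoff is easy to compute from the template sizes. The short Python sketch below is only an illustration of the arithmetic in the theorem; `conjunctive_cutoff` is a hypothetical name, `sizes` is assumed to hold $(|U_1|,\ldots,|U_k|)$, and `l` is the zero-based position of the class the property ranges over.

```python
def conjunctive_cutoff(sizes, l):
    """Cutoff (c_1, ..., c_k): 2|U_l| + 1 for the indexed class, 2|U_i| otherwise."""
    return tuple(2 * s + 1 if i == l else 2 * s for i, s in enumerate(sizes))
```

For a single class whose template has three local states, `conjunctive_cutoff((3,), 0)` gives `(7,)`, which agrees with the 7-process instance used for the starvation-freedom property in the mutual exclusion application.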

Although the above results yield decidability for PMCP in the conjunctive
guards case, the decision procedures are not efficient.

We now show that if we limit path quantification to range over infinite paths
only (i.e., ignore deadlocked paths), or over finite paths only, then we can give an
efficient decision procedure for this version of the PMCP. We use $A_{inf}$ for “for
all infinite paths”, $E_{inf}$ for “for some infinite path”, $A_{fin}$ for “for all finite paths”,
and $E_{fin}$ for “for some finite path”.

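Theorem 5.3 (Infinite Conjunctive Reduction Theorem).

For any LTL\X formula $h$ and $l \in [1..k]$, we have

(i) $\forall (n_1,\ldots,n_k) \succeq (1,\ldots,1) : (U_1,\ldots,U_k)^{(n_1,\ldots,n_k)} \models \bigwedge_{i_l} E_{inf}h(i_l)$ iff
$(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)} \models E_{inf}h(1_l)$,

(ii) $\forall (n_1,\ldots,n_k) \succeq (1,\ldots,1) : (U_1,\ldots,U_k)^{(n_1,\ldots,n_k)} \models \bigwedge_{i_l} A_{inf}h(i_l)$ iff
$(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)} \models A_{inf}h(1_l)$,

where $(c_1,\ldots,c_l,\ldots,c_k) = (1,\ldots,2,\ldots,1)$.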
Proof Sketch

To obtain (i), by appeal to symmetry, it suffices to establish that for each
$(n_1,\ldots,n_k) \succeq (1,\ldots,1)$ : $(U_1,\ldots,U_k)^{(n_1,\ldots,n_k)} \models E_{inf}h(1_l)$ iff $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)}
\models E_{inf}h(1_l)$. Using the duality between $A_{inf}$ and $E_{inf}$ on both sides of the latter
equivalence, we can also appeal to symmetry to obtain (ii). We establish the
latter equivalence as follows.

($\Rightarrow$) Let $x = x_0 \xrightarrow{b_0, g_0} x_1 \xrightarrow{b_1, g_1} \cdots$ denote an infinite computation of
$(U_1,\ldots,U_k)^{(n_1,\ldots,n_k)}$, where $b_i$ indicates which process fired the transition driving
the system from global state $x_i$ to $x_{i+1}$ and $g_i$ is the guard enabling the transition.
Since $x$ is infinite, it follows that there exists some process such that the
result of projecting $x$ onto that process results in a stuttering of an infinite local
computation of the process. By appeal to symmetry, we can without loss of generality
assume that for each process class $U_p$, if a copy of $U_p$ in $(U_1,\ldots,U_k)^{(n_1,\ldots,n_k)}$
has the above property then that copy is in fact the concrete process $U_p^1$ in case
$p \neq l$ and the concrete process $U_p^2$ in case $p = l$ and local computation $x(l,1)$ is
finite.

Define a (formal) sequence $y = y_0 \xrightarrow{b_0', g_0'} y_1 \xrightarrow{b_1', g_1'} \cdots$ by projecting each global
state $x_i$ onto process 1 coordinate for each class $U_p$ for $p \neq l$ and onto process
coordinates 1 and 2 for process class $U_l$ to get a state $y_i$. We let $b_i' = 1_l$ if $b_i
= 1_l$, $b_i' = 2_l$ if $b_i = 2_l$, else set $b_i' = \epsilon$, while $g_i'$ is the syntactic guard resulting
from $g_i$ by deleting all conjuncts corresponding to indices not preserved in the
projection. Then, by our construction and the fact that $x$ was an infinite computation,
we have that $y$ denotes a stuttering of a genuine infinite computation
of $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)}$. To see this, note that for any $i$ such that $y_i \neq y_{i+1}$, the
associated (formal) transitions have their guard $g_i'$ true, since for conjunctive
guards $g_i$ and their projections $g_i'$ we have $x_i \models g_i$ implies $y_i \models g_i'$, and can thus
fire in $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)}$. For any stuttering $i$ where $y_i = y_{i+1}$, the (formal)
transition is labelled by $b_i' = \epsilon$.

Thus, given an infinite computation path of $(U_1,\ldots,U_k)^{(n_1,\ldots,n_k)}$, there exists a
stuttering of an infinite computation path of $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)}$ such that the
local computation path of $U_l^1$ is the same in both. This path correspondence
proves the result.

($\Leftarrow$) Let $y = y_0, y_1, \ldots$ be an infinite computation path of $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)}$.
Then, consider the sequence of states $x = x_0, x_1, \ldots$, where $x(l,1) = y(l,1)$,
$x(l,2) = y(l,2)$ and $\forall (k,j) \neq (l,1), (l,2) : x(k,j) = (i_k)^\omega$. Let $g_i$ be the guard
labelling the transition $y_i \rightarrow y_{i+1}$ in state $y_i$. Then all the other processes are in
their initial states in $x_i$, and since the guards do allow initial states of all template
processes as “nonblocking” states in that their being present in the global
state does not falsify any guards, we have $x_i \models g_i$.

Thus, given an infinite computation path $y$ of $(U_1,\ldots,U_k)^{(c_1,\ldots,c_k)}$, there exists
an infinite computation path $x$ of $(U_1,\ldots,U_k)^{(n_1,\ldots,n_k)}$ such that the local
computation path of $U_l^1$ is the same in both. This path correspondence easily gives
us the desired result. □







In a similar fashion, we may prove the following result. 

Theorem 5.4 (Finite Conjunctive Reduction Theorem).

For any LTL\X formula $h$, and $l \in [1..k]$, we have

(i) $\forall (n_1,\ldots,n_k) \succeq (1,\ldots,1) : (U_1,\ldots,U_k)^{(n_1,\ldots,n_k)} \models \bigwedge_{i_l} E_{fin}h(i_l)$ iff
$(U_1,\ldots,U_k)^{(1,\ldots,1)} \models E_{fin}h(1_l)$,

(ii) $\forall (n_1,\ldots,n_k) \succeq (1,\ldots,1) : (U_1,\ldots,U_k)^{(n_1,\ldots,n_k)} \models \bigwedge_{i_l} A_{fin}h(i_l)$ iff
$(U_1,\ldots,U_k)^{(1,\ldots,1)} \models A_{fin}h(1_l)$.

Note that the above theorem permits us to verify safety properties efficiently.
Informally, this is because if there is a finite path leading to a “bad” state in
the system $(U_1,\ldots,U_k)^{(n_1,\ldots,n_k)}$, then there exists a finite path leading to a bad
state in $(U_1,\ldots,U_k)^{(1,\ldots,1)}$. Thus, checking that there is no finite path leading to a
bad state in $(U_1,\ldots,U_k)^{(n_1,\ldots,n_k)}$ reduces to checking it for $(U_1,\ldots,U_k)^{(1,\ldots,1)}$.

We can use this to obtain an Efficient Conjunctive Decidability Theorem.
Moreover, the results can be readily extended to formulae with multiple indices
as in the disjunctive guards case.

6 Applications 

Here, we consider a solution to the mutual exclusion problem. The template
process is given below. Initially, every process is in local state N, the non-critical
region.

[Figure: template process with local states N, T and C]

$U = T \lor N \lor C$ denotes the universal guard, which is always true
independent of the local states of other processes. If a process wants to enter the
critical section C, it goes into the trying region T, which it can always do since
$U$ is always true. Guard $G = N \lor T$, instantiated for process $i$ of $n$ processes,
takes the conjunctive form $\bigwedge_{j \neq i} (N_j \lor T_j)$. When $G$ is true, no other process is in
the critical section, and the transition from T to C can be taken. Note that all
guards are conjunctive with neutral (i.e., non-blocking) initial state N. Thus, by
the Finite Conjunctive Reduction Theorem for multi-indexed properties, PMCP
for all sizes $n$ with the mutual exclusion property $\bigwedge_{i \neq j} A_{fin} G \neg(C_i \land C_j)$ can
be reduced to checking a 2-process instance. Using the Conjunctive Cutoff Theorem,
the starvation-freedom property $\bigwedge_i A(G(T_i \Rightarrow F C_i))$ can be checked by a
7-process instance. In this simple example, mutual exclusion is maintained but
starvation-freedom fails.
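
The safety half of this example is small enough to check by brute force. The Python sketch below explores the reachable global states of an $n$-process instance by breadth-first search and reports whether two processes can ever be in C together. It is a hedged illustration rather than the authors' tool: the function names are hypothetical, and the transition from C back to N with the universal guard is an assumption about the figure, since the text only describes the moves from N to T and from T to C.

```python
from collections import deque


def successors(state):
    """Yield the global states reachable from `state` in one interleaved step."""
    for i, local in enumerate(state):
        others = state[:i] + state[i + 1:]
        if local == "N":
            nxt = "T"                      # guard U: always enabled
        elif local == "T" and all(o in ("N", "T") for o in others):
            nxt = "C"                      # guard G: no other process is in C
        elif local == "C":
            nxt = "N"                      # assumed exit transition, guard U
        else:
            continue                       # T with G false: process is blocked
        yield state[:i] + (nxt,) + state[i + 1:]


def mutual_exclusion_holds(n):
    """Explore all reachable states of the n-process instance; check <= 1 process in C."""
    initial = ("N",) * n
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if state.count("C") > 1:
            return False
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True
```

By the reduction stated above, running `mutual_exclusion_holds(2)` is the only check needed for the mutual exclusion property at every size $n$; trying larger values of `n` directly is a useful sanity check but not a substitute for the theorem.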
7 Concluding Remarks

PMCP is, in general, undecidable [1]. However, under certain restrictions, a
variety of positive results have been obtained. Early work includes [15], which uses
an abstract graph of exponential size “downstairs” to capture the behaviour of
arbitrary sized parameterized asynchronous programs “upstairs” over Fetch-and-Add
primitives; however, while it caters for partial automation, the completeness
of the method is not established, and it is not clear that it can be made fully
automatic. A semi-automated method requiring construction of a closure process
which represents computations of an arbitrary number of processes is described
in [4]; it is shown that, if for some $k$, $C \| U^k$ is appropriately bisimilar to $C \| U^{k+1}$,
then it suffices to check instances of size at most $k$ to solve the PMCP. But it is
not shown that such a cutoff $k$ exists, and the method is not guaranteed to be
complete. Kurshan and McMillan [13] introduce the related notion of a process
invariant (cf. [22]). Ip and Dill [12] describe another approach to dealing with
many processes using an abstract graph; it is sound but not guaranteed to be
complete; [18] proposes a similar construction for verification of safety properties
of cache coherence protocols, which is also sound but not complete. A theme is
that most of these methods suffer, first, from the drawback of being only partially
automated and hence requiring human ingenuity, and, second, from being sound
but not guaranteed complete (i.e., a path “upstairs” maps to a path “downstairs”,
but paths downstairs do not necessarily lift). Other methods can be
fully automated but do not appear to have a clearly defined class of protocols
on which they are guaranteed to terminate successfully (cf. [5], [21], [19]).

For systems comprised of CCS processes, German and Sistla [10] combine
the automata-theoretic method with process closures to permit efficient solution
to PMCP for single index properties, modulo deadlock. But an efficient solution
is only yielded for processes in a single class. Even for systems of the form
$C \| U^n$ a doubly exponential decision procedure results, which likely limits its
practical use. Emerson and Namjoshi [7] show that in a single class (or client-server)
synchronous framework PMCP is decidable but with PSPACE-complete
complexity. Moreover, this framework is undecidable in the asynchronous case.
A different type of parameterized reasoning about time bounds is considered in
[9].

In some sense, the closest results might be those of Emerson and Namjoshi
[6], who, for the token ring model, reduce reasoning about multi-indexed temporal
logic formulae over rings of arbitrary size to reasoning over rings up to a small cutoff size.
These results are significant in that, like ours, correctness over all sizes holds
iff correctness of (or up to) the small cutoff size holds. But these results were
formulated only for a single process class and for a restricted version of the
token ring model, namely one where the token cannot be used to pass values.
Also related are the results of Attie and Emerson [2]. In the context of program
synthesis, rather than program verification, it is shown how certain 2-process
solutions to synchronization problems could be inflated to n-process solutions.
However, the correspondence is not an “iff”, but is established in only one
direction for conjunctive-type guards. Disjunctive guards are not considered, nor
are multiple process classes.

We believe that our positive results on PMCP are significant for several 
reasons. Because PMCP solves (a major aspect of) the state explosion problem 
and the scalability problem in one fell swoop, many researchers have attempted 
to make it more tractable, despite its undecidability in general. Of course, PMCP 
seems to be prone to undecidability in practice as well, as is evidenced by the 
wide range of solution methods proposed that are only partially automated or 
incomplete or lack a well-defined domain of applicability. Our methods are fully
automated, returning a yes/no answer; they are sound and complete, as they
rely on establishing exact (up to stuttering) correspondences (yes upstairs iff
yes downstairs). In many cases, our methods are efficient, making the problem
genuinely tractable. An additional advantage is that downstairs we have a small
system of cutoff size that, but for its size, looks like a system of size n. This 
contrasts with methods that construct an abstract graph downstairs which may 
have a complex and non-obvious organization.

Abstract. This work* presents a minimization algorithm. The algorithm
receives a Kripke structure M and returns the smallest structure
that is simulation equivalent to M. The simulation equivalence relation 
is weaker than bisimulation but stronger than the simulation preorder. 
It strongly preserves ACTL and LTL (as sub-logics of ACTL*). 

We show that every structure M has a unique up to isomorphism reduced 
structure that is simulation equivalent to M and smallest in size. 

We give a Minimizing Algorithm that constructs the reduced structure. It 
first constructs the quotient structure for M, then eliminates transitions 
to little brothers and finally deletes unreachable states. 

The first step has maximal space requirements since it is based on the 
simulation preorder over M. To reduce these requirements we suggest 
the Partitioning Algorithm which constructs the quotient structure for 
M without ever building the simulation preorder. The Partitioning Al- 
gorithm has a better space complexity but might have worse time com- 
plexity. 



1 Introduction 

Temporal logic model checking is a method for verifying finite-state systems 
with respect to propositional temporal logic specifications. The method is fully 
automatic and quite efficient in time, but is limited by its high space require- 
ments. Many approaches to beat the state explosion problem of model checking 
have been suggested, including abstraction, partial order reduction, modular 
methods, and symmetry ([8]). All are aimed at reducing the size of the model 
(or Kripke structure) to which model checking is applied, thus extending its
applicability to larger systems. 

Abstraction methods, for instance, hide some of the irrelevant details of a 
system and then construct a reduced structure. The abstraction is required to 
be weakly preserving, meaning that if a property is true for the abstract structure 
then it is also true for the original one. Sometimes we require the abstraction 
to be strongly preserving so that, in addition, a property that is false for the 
abstract structure, is also false for the original one. 

In a similar manner, for modular model checking we construct a reduced
abstract environment for a part of the system that we wish to verify. In this case
as well, properties that are true (false) of the abstract environment should be
true (false) of the real environment.

* The full version of this paper, including proofs of correctness, can be found in [6].

It is common to define equivalence relations or preorders on structures in or- 
der to reflect strong or weak preservation of various logics. For example, language 
equivalence (containment) strongly (weakly) preserves the linear-time temporal 
logic LTL. Other relations that are widely used are the bisimulation equivalence
[15] and the simulation preorder [14]. The former guarantees strong preservation
of branching-time temporal logics such as CTL and CTL* [7]. The latter
guarantees weak preservation of the universal fragment of these logics (ACTL
and ACTL* [10]).

Bisimulation has the advantage of preserving more expressive logics. How- 
ever, this is also a disadvantage since it requires the abstract structure to be too 
similar to the original one, thus allowing less powerful reductions. The simulation 
preorder, on the other hand, allows more powerful reductions, but it provides 
only weak preservation. Language equivalence provides strong preservation and
large reductions; however, its complexity is exponential, while bisimulation and
simulation can be computed in polynomial time.

In this paper we investigate the simulation equivalence relation, which is weaker
than bisimulation but stronger than the simulation preorder and language equivalence.
Simulation equivalence strongly preserves ACTL*, and also strongly preserves
LTL and ACTL as sublogics of ACTL*. Both ACTL and LTL are widely
used for model checking in practice.

As an equivalence relation that is weaker than bisimulation, it can yield a
smaller minimized structure. For example, the structure in part 2 of Figure 1 is
minimized with respect to simulation equivalence. In comparison, the minimized
structure with respect to bisimulation is the structure in part 1 of Figure 1 and
the minimized structure with respect to language equivalence is the structure in
part 3 of Figure 1.




Fig. 1. Different minimized structures with respect to different equivalence relations

Given a Kripke structure M, we would like to find a structure M' that is
simulation equivalent to M and is the smallest in size (number of states and
transitions).

For bisimulation this can be done by constructing the quotient structure 
in which the states are the equivalence classes with respect to bisimulation. 
Bisimulation has the property that if one state in a class has a successor in 
another class then all states in the class have a successor in the other class. Thus, 
in the quotient structure there will be a transition between two classes if every 
(some) state in one class has a successor in the other. The resulting structure is 
the smallest in size that is bisimulation equivalent to the given structure M. 

The quotient structure for simulation equivalence can be constructed in a
similar manner. There are two main difficulties, however. First, it is not true
that all states in an equivalence class have successors in the same classes. As a
result, if we define a transition between classes whenever all states of one have
a successor in the other, then we get the $\forall$-quotient structure. If, on the other
hand, we have a transition between classes if there exists a state of one with a
successor in the other, then we get the $\exists$-quotient structure. Both structures are
simulation equivalent to M, but the $\forall$-quotient structure has fewer transitions
and therefore is preferable.

The other difficulty is that the quotient model for simulation equivalence 
is not the smallest in size. Actually, it is not even clear that there is a unique 
smallest structure that is simulation equivalent to M. 

The first result in this paper is showing that every structure has a unique 
up to isomorphism smallest structure that is simulation equivalent to it. This 
structure is reduced, meaning that it contains no simulation equivalent states, 
no little brothers (states that are smaller by the simulation preorder than one 
of their brothers), and no unreachable states. 

Our next result is the Minimizing Algorithm that, given a structure
M, constructs the reduced structure for M. Based on the maximal simulation
relation over M, the algorithm first builds the $\forall$-quotient structure with respect
to simulation equivalence. Then it eliminates transitions to little brothers.
Finally, it removes unreachable states. The time complexity of the algorithm is
$O(|S|^3)$. Its space complexity is $O(|S|^2)$, which is due to the need to hold the
simulation preorder in memory.

Since our main concern is space requirements, we suggest the Partitioning
Algorithm, which computes the quotient structure without ever computing the
simulation preorder. Similarly to [13], the algorithm starts with a partition $\Sigma_0$
of the state space into classes whose states are equally labeled. It also initializes
a preorder $H_0$ over the classes in $\Sigma_0$. At iteration $i+1$, $\Sigma_{i+1}$ is constructed by
splitting classes in $\Sigma_i$. The relation $H_{i+1}$ is updated based on $\Sigma_i$, $\Sigma_{i+1}$ and $H_i$.

When the algorithm terminates (after k iterations), $\Sigma_k$ is the set of equivalence
classes with respect to simulation equivalence. These classes form the states
of the quotient structure. The final $H_k$ is the maximal simulation preorder over
the states of the quotient structure. Thus, the Partitioning Algorithm replaces
the first step of the Minimizing Algorithm. Since every step in the Minimizing
Algorithm further reduces the size of the initial structure, the first step handles
the largest structure. Therefore, improving its complexity has the greatest
influence on the overall complexity of the algorithm.

The space complexity of the Partitioning Algorithm is
$O(|\Sigma_k|^2 + |S| \cdot \log(|\Sigma_k|))$. We assume that in most cases $|\Sigma_k| \ll |S|$, thus this complexity
is significantly smaller than that of the Minimizing Algorithm. Unfortunately,
time complexity will probably become worse (depending on the size of $\Sigma_k$). It
is bounded by $O(|S|^2 \cdot |\Sigma_k|^2 \cdot (|\Sigma_k|^2 + |R|))$. However, since our main concern is
the reduction in memory requirements, the Partitioning Algorithm is valuable.

Other works also suggest minimization algorithms. In [13], the quotient structure
with respect to bisimulation is constructed without first building the bisimulation
relation. We follow a similar approach. However, in our case states may
remain in the same class even when they do not have successors in the same
classes. Thus, our analysis is more complicated and requires both $\Sigma_i$ and $H_i$.
Symbolic bisimulation minimization is suggested in [5]. In [4] a minimized structure
with respect to bisimulation is generated directly out of the text. In [9] a
bisimulation minimization is applied to the intersection of the system automaton
and the specification automaton. The algorithm from [13] is used. [12] shows that
eliminating little brothers results in a simulation equivalent structure. However,
the paper does not consider the minimization problem.

Several works minimize a structure in a compositional way, preserving language
containment [2] or a given CTL formula [1]. Minimizing with respect to
a given formula may result in a more powerful reduction; however, it requires
the checked formula to be determined in advance.

The rest of the paper is organized as follows. Section 2 gives our basic definitions.
Section 3 defines reduced structures and shows that every structure has
a unique simulation equivalent reduced structure. Section 4 presents the Minimizing
Algorithm. Finally, Section 5 describes the Partitioning Algorithm and
discusses its space and time complexity.

2 Preliminaries 

Let AP be a set of atomic propositions. A Kripke structure M over AP is a four-tuple
$M = (S, s_0, R, L)$ where $S$ is a finite set of states; $s_0 \in S$ is the initial
state; $R \subseteq S \times S$ is the transition relation that must be total, i.e., for every state
$s \in S$ there is a state $s' \in S$ such that $R(s, s')$; and $L : S \to 2^{AP}$ is a function
that labels each state with the set of atomic propositions true in that state.

The size $|M|$ of a Kripke structure M is the pair $(|S|, |R|)$. We say that
$|M| \le |M'|$ if $|S| < |S'|$, or $|S| = |S'|$ and $|R| \le |R'|$.
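
The ordering on sizes is lexicographic on the pair (number of states, number of transitions). A minimal sketch of that comparison follows; the helper names `size` and `no_larger` are illustrative and do not come from the paper.

```python
def size(states, transitions):
    """Return |M| as the pair (|S|, |R|)."""
    return (len(states), len(transitions))


def no_larger(size_a, size_b):
    """|M| <= |M'|: fewer states, or equally many states and no more transitions.

    Python compares tuples lexicographically, which is exactly this ordering.
    """
    return size_a <= size_b


# A structure with 3 states is smaller than one with 4 states,
# regardless of how many transitions each has.
small = size({0, 1, 2}, {(0, 1), (1, 2), (2, 0), (0, 0)})
large = size({0, 1, 2, 3}, {(0, 1), (1, 2)})
print(no_larger(small, large))
```

Comparing states first reflects the fact that removing a state always counts as a reduction, even if the number of transitions were to grow.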

Given two structures M and M' over AP, a relation $H \subseteq S \times S'$ is a simulation
relation [14] over $M \times M'$ iff the following conditions hold:

1. $(s_0, s'_0) \in H$.

2. For all $(s, s') \in H$, $L(s) = L'(s')$ and
   $\forall t\,[(s, t) \in R \rightarrow \exists t'\,[(s', t') \in R' \wedge (t, t') \in H]]$.

We say that M' simulates M (denoted by $M \preceq M'$) if there exists a simulation
relation H over $M \times M'$.
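
To make condition 2 concrete, the following sketch computes the largest relation over $M \times M$ that satisfies it, by starting from all equally labeled pairs and discarding violating pairs until nothing changes. This is a naive fixpoint iteration chosen for readability, not the efficient algorithms of [3,11]; the function name and the set-of-pairs encoding are assumptions made for illustration.

```python
def maximal_simulation(states, R, L):
    """Naive greatest-fixpoint computation of the maximal simulation over M x M.

    states: iterable of states; R: set of (s, t) transitions;
    L: dict mapping each state to its label (any hashable value).
    A pair (s, t) in the result means that t simulates s.
    """
    succ = {s: {t for (u, t) in R if u == s} for s in states}
    # Start from all pairs with equal labels (first half of condition 2).
    H = {(s, t) for s in states for t in states if L[s] == L[t]}
    changed = True
    while changed:
        changed = False
        for (s, t) in list(H):
            # (s, t) violates condition 2 if some successor of s
            # cannot be matched by any successor of t.
            if any(all((s2, t2) not in H for t2 in succ[t]) for s2 in succ[s]):
                H.discard((s, t))
                changed = True
    return H


# Hypothetical example: states 1 and 2 carry the same label,
# but only state 2 can move to the 'd'-labeled state 4.
states = [0, 1, 2, 3, 4]
R = {(0, 1), (0, 2), (1, 3), (2, 3), (2, 4), (3, 3), (4, 4)}
L = {0: "a", 1: "b", 2: "b", 3: "c", 4: "d"}
H = maximal_simulation(states, R, L)
print((1, 2) in H, (2, 1) in H)
```

In this example state 2 simulates state 1 but not vice versa, so the two states are not simulation equivalent.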

The logic ACTL* [10] is the universal fragment of the powerful branching- 
time logic CTL*. ACTL* consists of the temporal operators X (next-time), U 
(until) and R (release) and the universal path quantifier A (for all paths). For 
lack of space the formal definition is omitted. It can be found in [8]. 

The following lemma and theorem have been proven in [10].

Lemma 1. $\preceq$ is a preorder on the set of structures.

Theorem 2. Suppose $M \preceq M'$. Then for every ACTL* formula f, $M' \models f$
implies $M \models f$.

Given two Kripke structures M, M', we say that M is simulation equivalent to
M' iff $M \preceq M'$ and $M' \preceq M$. It is easy to see that this is an equivalence relation.
By Theorem 2, if M and M' are simulation equivalent then they are equivalent
with respect to ACTL*. However, they are not necessarily equivalent with respect to CTL*.

A simulation relation H over $M \times M'$ is maximal iff for all simulation relations
H' over $M \times M'$, $H' \subseteq H$. In [10] it has been shown that if there is a simulation
relation over $M \times M'$ then there is a unique maximal simulation over $M \times M'$.



3 The Reduced Structure 

Given a Kripke structure M, we would like to find a reduced structure that will 
be simulation equivalent to M and smallest in size. In this section we show 
that a reduced structure always exists. Furthermore, we show that all reduced 
structures of M are isomorphic to each other. 

Let M be a Kripke structure. The maximal simulation relation over $M \times M$
always exists and is denoted by $H_M$. We need the following two definitions in
order to characterize reduced structures.

Two states $s_1, s_2 \in M$ are simulation equivalent iff $(s_1, s_2) \in H_M$ and
$(s_2, s_1) \in H_M$.

A state $s_1$ is a little brother of a state $s_2$ iff there exists a state $s_3$ such that:

- $(s_3, s_2) \in R$ and $(s_3, s_1) \in R$.

- $(s_1, s_2) \in H_M$ and $(s_2, s_1) \notin H_M$.

Definition 3. A Kripke structure M is reduced if:

1. There are no simulation equivalent states in M.

2. There are no states $s_1, s_2$ such that $s_1$ is a little brother of $s_2$.

3. All states in M are reachable from $s_0$.
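
The three conditions of Definition 3 can be checked directly once the maximal simulation relation is available. The sketch below assumes $H_M$ is supplied as a set of pairs that includes the trivial pairs $(s, s)$; the function name `is_reduced` is hypothetical.

```python
def is_reduced(states, s0, R, H):
    """Check the three conditions of Definition 3.

    H is assumed to be the maximal simulation relation over M x M,
    given as a set of pairs (s, t) meaning that t simulates s.
    """
    # 1. No two distinct states are simulation equivalent.
    for s in states:
        for t in states:
            if s != t and (s, t) in H and (t, s) in H:
                return False
    # 2. No state is a little brother of one of its brothers.
    for (father, s1) in R:
        for (other, s2) in R:
            if father == other and (s1, s2) in H and (s2, s1) not in H:
                return False
    # 3. Every state is reachable from the initial state.
    seen, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        for (u, t) in R:
            if u == s and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen == set(states)


# Hypothetical example: state 1 is a little brother of state 2 (father 0).
identity = {(s, s) for s in range(5)}
R = {(0, 1), (0, 2), (1, 3), (2, 3), (2, 4), (3, 3), (4, 4)}
print(is_reduced(range(5), 0, R, identity | {(1, 2)}))
```

The example structure fails condition 2, so the check reports that it is not reduced.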



Theorem 4. Let M, M' be two reduced Kripke structures. Then the following
two statements are equivalent:

1. M and M' are simulation equivalent.

2. M and M' are isomorphic.

The proof that 2 implies 1 is straightforward. In the rest of this section
we assume that M and M' are reduced Kripke structures. We will show that if
$M \preceq M'$ and $M' \preceq M$ then M and M' are isomorphic.

We use $H_{MM'}$ and $H_{M'M}$ to denote the maximal simulation relations over
$M \times M'$ and $M' \times M$ respectively. The composed relation $H_{MM'M} \subseteq S \times S$ is
defined by $H_{MM'M} = \{(s_1, s_2) \mid \exists s' \in S'.\ (s_1, s') \in H_{MM'} \wedge (s', s_2) \in H_{M'M}\}$.

Lemma 5. The composed relation $H_{MM'M}$ is a simulation relation.

For the reduced Kripke structures M and M', we define the matching relation
$f \subseteq S' \times S$ as follows:

$(s', s) \in f$ iff $(s', s) \in H_{M'M}$ and $(s, s') \in H_{MM'}$.

We show that f is an isomorphism between M' and M, i.e., f is a one-to-one
and onto total function that preserves the state labeling and the transition
relation.

Lemma 6. Let $f \subseteq S' \times S$ be the matching relation. Then f is a one-to-one,
onto, and total function from S' to S.

Proof Sketch: First we need to prove that f is a function from S' to S. We
assume to the contrary that there are different states $s_1, s_2 \in S$ and $s' \in S'$ such
that $(s', s_1) \in f$ and $(s', s_2) \in f$. We show that $(s_1, s_2) \in H_{MM'M}$ and
$(s_2, s_1) \in H_{MM'M}$. Since $H_{MM'M}$ is included in $H_M$, this contradicts the assumption that
M is reduced. The proof that $f^{-1}$ is a function from S to S' is similar. Thus,
we conclude that f is one-to-one.

Next, we prove that f is onto, i.e., for every state s in S there exists a state
s' in S' such that $(s', s) \in f$. The proof is by induction on the distance of $s \in S$
from the initial state (since all states are reachable, the distance is bounded by
$|S|$). Again we use the composed relation $H_{MM'M}$ to show that if f is not onto
then M' is not reduced.

Similarly, we can show that $f^{-1}$ is onto and therefore f is total. □

Lemma 7. For all $s' \in S'$, $L'(s') = L(f(s'))$. Furthermore, for all $s'_1, s'_2 \in S'$,
$(s'_1, s'_2) \in R'$ iff $(f(s'_1), f(s'_2)) \in R$.

Thus, we conclude Theorem 4.

Theorem 8. Let M be a non-reduced Kripke structure. Then there exists a
reduced Kripke structure M' such that M, M' are simulation equivalent and
$|M'| < |M|$.

In order to prove Theorem 8, we present in the next sections an algorithm
that receives a Kripke structure M and computes a reduced Kripke structure M',
which is simulation equivalent to M, such that $|M'| \le |M|$. Moreover, if M is
not reduced then $|M'| < |M|$.

Lemma 9. Let M' be a reduced Kripke structure. For every M that is simulation
equivalent to M', if M and M' are not isomorphic then $|M'| < |M|$.

4 The Minimizing Algorithm

In this section we present the Minimizing Algorithm that gets a Kripke structure
M and computes a reduced Kripke structure M' which is simulation equivalent
to M and satisfies $|M'| \le |M|$. If M is not reduced then $|M'| < |M|$.

The algorithm consists of three steps. First, a quotient structure is con- 
structed in order to eliminate equivalent states. The resulting quotient model is 
simulation equivalent to M but may not be reduced. The next step disconnects 
little brothers and the last one removes all unreachable states. 

In each step of the algorithm, if the resulting structure differs from the orig- 
inal one then the resulting one is strictly smaller than the original structure. 



4.1 The $\forall$-Quotient Structure

In order to compute a simulation equivalent structure that contains no equivalent
states, we compute the $\forall$-quotient structure with respect to the simulation
equivalence relation. We fix M to be the original Kripke structure. We denote
by $[s]$ the equivalence class which includes s.

Definition 10. The quotient structure $M_q = \langle S_q, R_q, s_{0q}, L_q \rangle$ of M is
defined as follows:

- $S_q$ is the set of the equivalence classes of the simulation equivalence. (We
  will use Greek letters to represent equivalence classes.)

- $R_q = \{(\alpha_1, \alpha_2) \mid \forall s_1 \in \alpha_1\ \exists s_2 \in \alpha_2.\ (s_1, s_2) \in R\}$

- $s_{0q} = [s_0]$.

- $L_q([s]) = L(s)$.

The transitions in $M_q$ are $\forall$-transitions, in which there is a transition between
two equivalence classes iff every state of the one has a successor in the other. We
could also define $\exists$-transitions, in which there is a transition between classes if
there exists a state in one with a successor in the other. Both definitions result
in a simulation equivalent structure. However, the former has a smaller transition
relation and therefore it is preferable.
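
The difference between the two kinds of transitions can be seen in a small sketch of Definition 10. It assumes the maximal simulation relation is given as a set of pairs including the trivial ones; the function name `quotient` and the `every` flag (which switches between $\forall$- and $\exists$-transitions) are illustrative only.

```python
def quotient(states, s0, R, L, H, every=True):
    """Build the quotient structure of Definition 10.

    H: maximal simulation over M x M as a set of pairs, trivial pairs included.
    every=True yields forall-transitions, every=False exists-transitions.
    """
    # [s] = set of states that are simulation equivalent to s.
    cls = {s: frozenset(t for t in states if (s, t) in H and (t, s) in H)
           for s in states}
    succ = {s: {t for (u, t) in R if u == s} for s in states}
    quantifier = all if every else any
    Sq = set(cls.values())
    Rq = {(a, b) for a in Sq for b in Sq
          if quantifier(bool(succ[s] & b) for s in a)}
    Lq = {cls[s]: L[s] for s in states}
    return Sq, cls[s0], Rq, Lq


# Hypothetical example: states 1 and 2 are equivalent, but only state 2
# has a successor in the class of state 5 (a little brother of state 3).
states = [0, 1, 2, 3, 4, 5, 6]
R = {(0, 1), (0, 2), (1, 3), (2, 3), (2, 5),
     (3, 4), (3, 6), (5, 4), (4, 4), (6, 6)}
L = {0: "a", 1: "b", 2: "b", 3: "c", 5: "c", 4: "d", 6: "e"}
H = {(s, s) for s in states} | {(1, 2), (2, 1), (5, 3)}
_, _, Rq_all, _ = quotient(states, 0, R, L, H, every=True)
_, _, Rq_some, _ = quotient(states, 0, R, L, H, every=False)
print(len(Rq_all), len(Rq_some))
```

In this example the $\exists$-quotient keeps one extra transition, from the class {1, 2} to the class {5}, which the $\forall$-quotient drops.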

Note that $|S_q| \le |S|$ and $|R_q| \le |R|$. If $|S_q| = |S|$, then every equivalence class
contains a single state. In this case, $R_q$ is identical to R and $M_q$ is isomorphic
to M. Thus, when M and $M_q$ are not isomorphic, $|S_q| < |S|$.

Next, we show that M and $M_q$ are simulation equivalent.

Definition 11. Let $G \subseteq S$ be a set of states. A state $s_m \in G$ is maximal in G
iff there is no state $s \in G$ such that $(s_m, s) \in H_M$ and $(s, s_m) \notin H_M$.

Definition 12. Let $\alpha$ be a state of $M_q$, and $t_1$ a successor of some state in $\alpha$.
The set $G(\alpha, t_1)$ is defined as follows:

$G(\alpha, t_1) = \{t_2 \in S \mid \exists s_2 \in \alpha.\ (s_2, t_2) \in R \wedge (t_1, t_2) \in H_M\}$.




262 Doron Bustan and Orna Grumberg 



Intuitively, $G(\alpha, t_1)$ is the set of states that are greater than $t_1$ and are successors
of states in $\alpha$. Notice that since all states in $\alpha$ are simulation equivalent, every
state in $\alpha$ has at least one successor in $G(\alpha, t_1)$.

Lemma 13. Let $\alpha, t_1$ be as defined in Definition 12. Then for every maximal
state $t_m$ in $G(\alpha, t_1)$, $[t_m]$ is a successor of $\alpha$.

Proof: Let $t_m$ be a maximal state in $G(\alpha, t_1)$, and let $s_m \in \alpha$ be a state such
that $t_m$ is a successor of $s_m$. We prove that for every state $s \in \alpha$, there exists a
successor $t \in [t_m]$, which implies that $[t_m]$ is a successor of $\alpha$.

$s, s_m \in \alpha$ implies $(s_m, s) \in H_M$. This implies that there exists a successor t of
s such that $(t_m, t) \in H_M$. By transitivity of the simulation relation, $(t_1, t) \in H_M$.
Thus $t \in G(\alpha, t_1)$. Since $t_m$ is maximal in $G(\alpha, t_1)$, $(t, t_m) \in H_M$. Thus, t and
$t_m$ are simulation equivalent and $t \in [t_m]$. □

Theorem 14. The structures M and $M_q$ are simulation equivalent.

Proof Sketch: It is straightforward to show that $H' = \{(\alpha, s) \mid s \in \alpha\}$ is a
simulation relation over $M_q \times M$. Thus, $M_q \preceq M$.

In order to prove that $M \preceq M_q$ we choose
$H' = \{(s_1, \alpha) \mid \exists s_2 \in \alpha.\ (s_1, s_2) \in H_M\}$.
Clearly, $(s_0, s_{0q}) \in H'$ and for all $(s, \alpha) \in H'$, $L(s) = L_q(\alpha)$.

Assume $(s_1, \alpha_1) \in H'$ and let $t_1$ be a successor of $s_1$. We prove that there
exists a successor $\alpha_2$ of $\alpha_1$ such that $(t_1, \alpha_2) \in H'$. We distinguish between two
cases:

1. $s_1 \in \alpha_1$. Let $t_m$ be a maximal state in $G(\alpha_1, t_1)$; then Lemma 13 implies
that $(\alpha_1, [t_m]) \in R_q$. Since $t_m$ is maximal in $G(\alpha_1, t_1)$, $(t_1, t_m) \in H_M$, which
implies $(t_1, [t_m]) \in H'$.

2. $s_1 \notin \alpha_1$. Let $s_2 \in \alpha_1$ be a state such that $(s_1, s_2) \in H_M$. Since
$(s_1, s_2) \in H_M$ there is a successor $t_2$ of $s_2$ such that $(t_1, t_2) \in H_M$. The first case
implies that there exists an equivalence class $\alpha_2$ such that $(\alpha_1, \alpha_2) \in R_q$ and
$(t_2, \alpha_2) \in H'$. By $(t_2, \alpha_2) \in H'$ we have that there exists a state $t_3 \in \alpha_2$
such that $(t_2, t_3) \in H_M$. By transitivity of simulation, $(t_1, t_3) \in H_M$. Thus,
$(t_1, \alpha_2) \in H'$. □

4.2 Disconnecting Little Brothers 

Our next step is to disconnect the little brothers from their fathers. As a result 
of applying this step to a Kripke structure M with no equivalent states, we get 
a Kripke structure M' satisfying: 

1. M and M' are simulation equivalent.

2. There are no equivalent states in M'.

3. There are no little brothers in M'.

4. $|M'| \le |M|$, and if M and M' are not identical, then $|M'| < |M|$.

change := true
while (change = true) do
    Compute the maximal simulation relation $H_M$
    change := false
    if there are $s_1, s_2, s_3 \in S$ such that $s_1$ is a little brother of $s_2$
       and $s_3$ is the father of both $s_1$ and $s_2$ then
        change := true
        $R := R \setminus \{(s_3, s_1)\}$
    end
end



Fig. 2. The Disconnecting Algorithm. 



In Figure 2 we present an iterative algorithm which disconnects little brothers 
and results in M'. 

Since in each iteration of the algorithm one edge is removed, the algorithm 
will terminate after at most \R\ iterations. We will show that the resulting struc- 
ture is simulation equivalent to the original one. 

Lemma 15. Let $M' = \langle S', R', s'_0, L' \rangle$ be the result of the Disconnecting Algorithm
on M. Then M and M' are simulation equivalent.

Proof Sketch: We prove the lemma by induction on the number of iterations.
Base: at the beginning M and M are simulation equivalent.

Induction step: Let M'' be the result of the first i iterations and H'' be the
maximal simulation over $M'' \times M''$. Let M' be the result of the (i+1)th iteration,
where R' is obtained from R'' by removing a single edge. Assume that M and M'' are simulation equivalent.
It is straightforward to see that $H' = \{(s'_1, s''_2) \mid (s''_1, s''_2) \in H''\}$ is a simulation
relation over $M' \times M''$. Thus, $M' \preceq M''$.

To show that $M'' \preceq M'$ we prove that $H' = \{(s''_1, s'_2) \mid (s''_1, s''_2) \in H''\}$ is a
simulation relation. Clearly, $(s''_0, s'_0) \in H'$ and for all $(s''_1, s'_2) \in H'$, $L''(s''_1) = L'(s'_2)$.

Suppose $(s''_1, s'_2) \in H'$ and $t''_1$ is a successor of $s''_1$. Since H'' is a simulation
relation, there exists a successor $t''_2$ of $s''_2$ such that $(t''_1, t''_2) \in H''$. This implies
that $(t''_1, t'_2) \in H'$. If $(s'_2, t'_2) \in R'$ then we are done. Otherwise, $(s''_2, t''_2)$ is removed
from R'' because $t''_2$ is a little brother of some successor $t''_3$ of $s''_2$. Since $(s''_2, t''_2)$
is the only edge removed at the (i+1)th iteration, $(s'_2, t'_3) \in R'$. Because $t''_2$ is a
little brother of $t''_3$, $(t''_2, t''_3) \in H''$. By transitivity of the simulation relation,
$(t''_1, t''_3) \in H''$, thus $(t''_1, t'_3) \in H'$. □

We proved that the result M' of the Disconnecting Algorithm is simulation
equivalent to the original structure M. Note that M' has the same set of states
as M. We now show that the maximal simulation relation over M is identical to
the maximal simulation relations for all intermediate structures M'' (including
M'), computed by the Disconnecting Algorithm. Since there are no simulation
equivalent states in M, there are no such states in M' either.

Lemma 16. Let $M' = \langle S, R', s_0, L \rangle$ be the result of the Disconnecting Algorithm
on M and let $H' \subseteq S \times S$ be the maximal simulation over $M' \times M'$.
Then, $H_M = H'$.

The lemma is proved by induction on the number of iterations. 

As a result of the last lemma, the Disconnecting Algorithm can be simplified
significantly. The maximal simulation relation is computed once on the original
structure M and is used in all iterations. If the algorithm is executed symbolically
(with BDDs) then this operation can be performed efficiently in one step:

$R' = R - \{(s_1, s_2) \mid \exists s_3 : (s_1, s_3) \in R \wedge (s_2, s_3) \in H_M \wedge (s_3, s_2) \notin H_M\}$.
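
The one-step formula can be read as a set comprehension. The sketch below uses explicit Python sets rather than BDDs, so it illustrates the formula only and not the symbolic implementation the text refers to; the function name is hypothetical.

```python
def disconnect_little_brothers(R, H):
    """Remove every edge (s1, s2) where s2 is a little brother of some
    other successor s3 of s1, using a precomputed maximal simulation H."""
    targets = {t for (_, t) in R}

    def leads_to_little_brother(s1, s2):
        return any((s1, s3) in R and (s2, s3) in H and (s3, s2) not in H
                   for s3 in targets)

    return {(s1, s2) for (s1, s2) in R if not leads_to_little_brother(s1, s2)}


# Hypothetical example: state 1 is a little brother of state 2, father 0.
R = {(0, 1), (0, 2), (1, 3), (2, 3), (2, 4), (3, 3), (4, 4)}
H = {(s, s) for s in range(5)} | {(1, 2)}
print(sorted(disconnect_little_brothers(R, H)))
```

Only the edge from the father to the little brother is removed here; every other edge is kept.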



4.3 The Algorithm 

We now present our algorithm for constructing the reduced structure for a given 
one. 



1. Compute the $\forall$-quotient structure $M_q$ of M and
   the maximal simulation relation $H_M$ over $M_q \times M_q$.

2. $R' = R_q - \{(s_1, s_2) \mid \exists s_3 : (s_1, s_3) \in R_q \wedge (s_2, s_3) \in H_M\}$

3. Remove all unreachable states.



Fig. 3. The Minimizing Algorithm 



Note that in the second step we eliminate the check $(s_3, s_2) \notin H_M$. This
is based on the fact that $M_q$ does not contain simulation equivalent states.
Removing unreachable states does not change the properties of simulation with
respect to the initial states. The size of the resulting structure is equal to or
smaller than the original one. Similarly to the first two steps of the algorithm,
if the resulting structure is not identical then it is strictly smaller in size.
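
Steps 2 and 3 can be sketched as follows for a structure that already has no simulation equivalent states. One point is an assumption of this sketch rather than something stated in the text: because a maximal simulation relation contains every trivial pair $(s, s)$, dropping the check $(s_3, s_2) \notin H_M$ only works if $s_3$ is required to differ from $s_2$, so the code makes that requirement explicit. The function name is hypothetical.

```python
def disconnect_and_prune(states, s0, R, H):
    """Steps 2 and 3 of the Minimizing Algorithm on a structure that
    contains no simulation equivalent states.

    H is the maximal simulation relation, trivial pairs included.
    Returns the reachable states and the transitions among them.
    """
    # Step 2: with no equivalent states, (s2, s3) in H and s2 != s3
    # already implies (s3, s2) not in H.
    R2 = {(s1, s2) for (s1, s2) in R
          if not any(s3 != s2 and (s1, s3) in R and (s2, s3) in H
                     for s3 in states)}
    # Step 3: keep only what is reachable from the initial state.
    reachable, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        for (u, t) in R2:
            if u == s and t not in reachable:
                reachable.add(t)
                stack.append(t)
    R3 = {(s, t) for (s, t) in R2 if s in reachable}
    return reachable, R3


# Hypothetical example: removing the edge (0, 1) leaves state 1 unreachable.
states = [0, 1, 2, 3, 4]
R = {(0, 1), (0, 2), (1, 3), (2, 3), (2, 4), (3, 3), (4, 4)}
H = {(s, s) for s in states} | {(1, 2)}
print(disconnect_and_prune(states, 0, R, H))
```

In the example, the little brother is first disconnected from its father and then disappears entirely in step 3, together with its outgoing edge.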

We have proved that the result M' of the Minimizing Algorithm is simulation
equivalent to the original structure M. Thus we can conclude that Theorem 8
holds.

Figure 4 presents an example of the three steps of the Minimizing Algorithm 
applied to a Kripke structure. 

1 . Part 1 contains the original structure, where the maximal simulation relation 
is (not including the trivial pairs): 

{(2, 3), (3, 2), (11, 2), (11, 3), (4, 5), (6, 5), (7, 8), (8, 7), (9, 10), (10, 9)}. 

The equivalence classes are : {{1}, {2, 3}, {11}, {4}, {5}, {6}, {7, 8}, {9, 10}}. 

2. Part 2 presents the $\forall$-quotient structure $M_q$. The maximal simulation relation $H_M$
is (not including the trivial pairs):

$H_M$ = {({11}, {2, 3}), ({4}, {5}), ({6}, {5})}.

3. {11} is a little brother of {2,3} and {1} is their father. Part 3 presents the
structure after the removal of the edge ({1}, {11}).

4. Finally, part 4 contains the reduced structure, obtained by removing the
unreachable states.

Fig. 4. An example of the Minimizing Algorithm



4.4 Complexity 

The complexity of each step of the algorithm depends on the size of the Kripke 
structure resulting from the previous step. In the worst case the Kripke structure 
does not change, thus all three steps depend on the original Kripke structure. Let 
M be the given structure. We analyze each step separately (a naive analysis): 

1. First, the algorithm constructs equivalence classes. To do that it needs to
compute the maximal simulation relation. [3,11] showed that this can be
done in time $O(|S| \cdot |R|)$. Once the algorithm has the simulation relation, the
equivalence classes can be constructed in time $O(|S|^2)$. Next, the algorithm
constructs the transition relation. This can be done in time $O(|S| + |R|)$. As
a whole, building the quotient structure can be done in time $O(|S| \cdot |R|)$.

2. Disconnecting little brothers can be done in $O(|S|^3)$.

3. Removing unreachable states can be done in $O(|R|)$.

As a whole the algorithm works in time $O(|S|^3)$.

The space bottleneck of the algorithm is the computation of the maximal
simulation relation, which is bounded by $|S|^2$.



5 Partition Classes 

In the previous section, we presented the Minimizing Algorithm. The algorithm
consists of three steps, each of which results in a structure that is smaller in size.
Since the first step handles the largest structure, improving its complexity will
have the greatest influence on the overall complexity of the algorithm.

In this section we suggest an alternative algorithm for computing the set
of equivalence classes. The algorithm avoids the construction of the simulation
relation over the original structure. As a result, it has a better space complexity,
but its time complexity is worse. Since the purpose of the Minimizing Algorithm
is to reduce space requirements, it is more important to reduce its own space
requirement.

5.1 The Partitioning Algorithm

Given a structure M, we would like to build the equivalence classes of the simulation
equivalence relation, without first calculating $H_M$. Our algorithm, called
the Partitioning Algorithm, starts with a partition $\Sigma_0$ of S into classes. The classes
in $\Sigma_0$ differ from one another only by their state labeling. In each iteration, the
algorithm refines the partition and forms a new set of classes. We use $\Sigma_i$ to
denote the set of the classes obtained after i iterations. In order to refine the
partitions we build an ordering relation $H_i$ over $\Sigma_i \times \Sigma_i$ which is updated in
every iteration according to the previous and current partitions ($\Sigma_{i-1}$ and $\Sigma_i$)
and the previous ordering relation ($H_{i-1}$). Initially, $H_0$ includes only the identity
pairs (of classes).

In the algorithm, we use succ(s) for the set of successors of s. Whenever $\Sigma_i$
is clear from the context, $[s]$ is used for the equivalence class of s. We also use a
function $\Pi$ that associates with each class $\alpha \in \Sigma_i$ the set of classes $\alpha' \in \Sigma_{i-1}$
that contain a successor of some state in $\alpha$.

$\Pi(\alpha) = \{[t] \mid \exists s \in \alpha.\ (s, t) \in R\}$

We use English letters to denote states, capital English letters to denote sets of
states, Greek letters to denote equivalence classes, and capital Greek letters to
denote sets of equivalence classes. The Partitioning Algorithm is presented in
Figure 5.

Definition 17. The partial order $\le_i$ on S is defined by: $s_1 \le_i s_2$ implies
$L(s_1) = L(s_2)$ and, if $i > 0$, $\forall t_1\,[(s_1, t_1) \in R \rightarrow \exists t_2\,[(s_2, t_2) \in R \wedge ([t_1], [t_2]) \in H_{i-1}]]$.
In case $i = 0$, $s_1 \le_0 s_2$ iff $L(s_1) = L(s_2)$.
Two states $s_1, s_2$ are i-equivalent iff $s_1 \le_i s_2$ and $s_2 \le_i s_1$.

In the rest of this section we explain how the algorithm works. There are 
three invariants which are preserved during the execution of the algorithm. 

Invariant 1: For all states $s_1, s_2 \in S$, $s_1$ and $s_2$ are in the same
class $\alpha \in \Sigma_i$ iff $s_1$ and $s_2$ are i-equivalent.

Invariant 2: For all states $s_1, s_2 \in S$, $s_1 \le_i s_2$ iff
$([s_1], [s_2]) \in H_i$.

Invariant 3: $H_i$ is transitive.

$\Sigma_i$ is a set of equivalence classes with respect to the i-equivalence
relation. In the ith iteration we split the equivalence classes of
$\Sigma_{i-1}$ so that only states that are i-equivalent remain in the same class.

A class $\alpha \in \Sigma_{i-1}$ is repeatedly split by choosing an arbitrary
state $s_p \in \alpha$ (called the splitter) and identifying the states in
$\alpha$ that are i-equivalent to $s_p$. These states form an i-equivalence
class $\alpha'$ that is inserted into $\Sigma_i$.

$\alpha'$ is constructed in two steps. First we calculate the set of states
$GT \subseteq \alpha$ that contains all states $s_g$ such that $s_p \le_i s_g$.
Next we calculate the set of states $LT \subseteq \alpha$ that contains all
states $s_l$ such that $s_l \le_i s_p$. The states in the intersection of GT and
LT are the states in $\alpha$ that are i-equivalent to $s_p$.

$H_i$ captures the partial order $\le_i$, i.e., $s_1 \le_i s_2$ iff
$([s_1], [s_2]) \in H_i$. Note that the sequence $\le_0, \le_1, \ldots$ satisfies
$\le_0 \supseteq \le_1 \supseteq \le_2 \supseteq \cdots$. Therefore, if $s_1 \le_i s_2$ then







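Initialize the algorithm:
  change := true
  for each label $a \in 2^{AP}$ construct $\alpha_a \in \Sigma_0$ such that $s \in \alpha_a \Leftrightarrow L(s) = a$.
  $H_0 = \{(\alpha,\alpha) \mid \alpha \in \Sigma_0\}$

while change = true do begin
  change := false

  refine $\Sigma$:
    $\Sigma_{i+1} := \emptyset$
    for each $\alpha \in \Sigma_i$ do begin
      while $\alpha \neq \emptyset$ do begin
        choose $s_p$ such that $s_p \in \alpha$
        $GT := \{s_g \mid s_g \in \alpha \wedge \forall t_p \in succ(s_p)\ \exists t_g \in succ(s_g).\ ([t_p],[t_g]) \in H_i\}$
        $LT := \{s_l \mid s_l \in \alpha \wedge \forall t_l \in succ(s_l)\ \exists t_p \in succ(s_p).\ ([t_l],[t_p]) \in H_i\}$
        $\alpha' := GT \cap LT$
        if $\alpha \neq \alpha'$ then change := true
        $\alpha := \alpha \setminus \alpha'$
        Add $\alpha'$ as a new class to $\Sigma_{i+1}$.
      end
    end

  update H:
    $H_{i+1} = \emptyset$
    for every $(\alpha_1,\alpha_2) \in H_i$ do begin
      for each $\alpha'_2, \alpha'_1 \in \Sigma_{i+1}$ such that $\alpha_2 \supseteq \alpha'_2$, $\alpha_1 \supseteq \alpha'_1$ do begin
        $\Phi = \{\phi \mid \exists \xi \in \Pi(\alpha'_2).\ (\phi,\xi) \in H_i\}$
        if $\Phi \supseteq \Pi(\alpha'_1)$ then
          insert $(\alpha'_1,\alpha'_2)$ to $H_{i+1}$
        else
          change := true
      end
    end
end

Fig. 5. The Partitioning Algorithm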

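$s_1 \le_{i-1} s_2$. Thus, $([s_1],[s_2]) \in H_i$ implies
$([s_1],[s_2]) \in H_{i-1}$. Based on that, when constructing $H_i$ it is
sufficient to check $(\alpha'_1, \alpha'_2) \in H_i$ only in case
$\alpha_2 \supseteq \alpha'_2$, $\alpha_1 \supseteq \alpha'_1$, and
$(\alpha_1,\alpha_2) \in H_{i-1}$.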

For suitable $\alpha'_1$ and $\alpha'_2$, we first construct the set $\Phi$ of
classes that are "smaller" than the classes in $\Pi(\alpha'_2)$. By checking if
$\Phi \supseteq \Pi(\alpha'_1)$ we determine whether every class in
$\Pi(\alpha'_1)$ is "smaller" than some class in $\Pi(\alpha'_2)$, in which
case $(\alpha'_1, \alpha'_2)$ is inserted into $H_i$.

When the algorithm terminates, $\le_i$ is the maximal simulation relation and
the i-equivalence is the simulation equivalence relation over $M \times M$.
Moreover, $H_i$ is the maximal simulation relation over the corresponding
quotient structure $M_q$.
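The refine and update steps of Figure 5 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the structure is passed as plain dictionaries (`label`, `succ`), classes are tracked by list index, the split test `eq != rest` is one reading of the change condition in the figure, and the four-state example is invented for illustration.

```python
from itertools import product

def partition_simulation(states, label, succ):
    """Return (classes, H): classes is a list of frozensets of states and H is a
    set of index pairs (a, b) meaning class a is simulated by class b."""
    by_label = {}
    for s in states:                       # Sigma_0: one class per label
        by_label.setdefault(label[s], set()).add(s)
    classes = [frozenset(c) for c in by_label.values()]
    H = {(i, i) for i in range(len(classes))}   # H_0: identity pairs only

    change = True
    while change:
        change = False
        cls_of = {s: i for i, c in enumerate(classes) for s in c}

        def below(s, t):
            # every successor of s is H-below some successor of t
            return all(any((cls_of[u], cls_of[v]) in H for v in succ[t])
                       for u in succ[s])

        # refine: split each class around arbitrarily chosen splitters
        new_classes, parent = [], []
        for i, c in enumerate(classes):
            rest = set(c)
            while rest:
                sp = next(iter(rest))
                gt = {s for s in rest if below(sp, s)}
                lt = {s for s in rest if below(s, sp)}
                eq = gt & lt
                if eq != rest:
                    change = True
                rest -= eq
                new_classes.append(frozenset(eq))
                parent.append(i)

        # update: Pi(c) = old classes containing a successor of a state in c
        def pi(c):
            return {cls_of[t] for s in c for t in succ[s]}

        new_H = set()
        for a1, a2 in H:
            for j1, j2 in product(range(len(new_classes)), repeat=2):
                if parent[j1] == a1 and parent[j2] == a2:
                    phi = {p for p, x in H if x in pi(new_classes[j2])}
                    if pi(new_classes[j1]) <= phi:
                        new_H.add((j1, j2))
                    else:
                        change = True
        classes, H = new_classes, new_H
    return classes, H

# Illustrative structure: 0 and 1 share a label, but 1 has an extra move to 3.
classes, H = partition_simulation(
    [0, 1, 2, 3],
    {0: "a", 1: "a", 2: "b", 3: "c"},
    {0: {2}, 1: {2, 3}, 2: {2}, 3: {3}},
)
```

In this example state 0 is simulated by state 1 but not conversely, so the two states end up in different classes while the pair ({0}, {1}) stays in the ordering relation.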

The algorithm runs until there is no change both in the partition $\Sigma_i$ and
in the relation $H_i$. A change in $\Sigma_i$ is the result of a partitioning of
some class $\alpha \in \Sigma_i$. The number of changes in $\Sigma_i$ is bounded
by the number of possible partitions, which is bounded by $|S|$.
A change in $H_i$ results in the relation $\le_{i+1}$ which is contained in
$\le_i$ and smaller in size, i.e., $|\le_i| > |\le_{i+1}|$. The number of
changes in $H_i$ is therefore bounded by $|\le_0|$, which is bounded by $|S|^2$.
Thus, the algorithm terminates after at most $|S|^2 + |S|$ iterations. Note that
it is possible that in some iteration i, $\Sigma_i$ will not change but $H_i$
will, and in a later iteration $j > i$, $\Sigma_j$ will change again.

Example: In this example we show how the Partitioning Algorithm is applied
to the Kripke structure presented in Figure 6.




Fig. 6. An example structure 



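- We initialize the algorithm as follows:

  $\Sigma_0 = \{\alpha_0, \beta_0, \gamma_0, \delta_0\}$,
  $H_0 = \{(\alpha_0,\alpha_0), (\beta_0,\beta_0), (\gamma_0,\gamma_0), (\delta_0,\delta_0)\}$,
  where $\alpha_0 = \{0,1,2\}$, $\beta_0 = \{3,4,5\}$, $\gamma_0 = \{6,7\}$, $\delta_0 = \{8,9\}$.

- The first iteration results in the relations:

  $\Sigma_1 = \{\alpha_1, \alpha_2, \beta_1, \beta_2, \beta_3, \gamma_0, \delta_0\}$,
  $H_1 = \{(\alpha_1,\alpha_1), (\alpha_2,\alpha_2), (\beta_1,\beta_1), (\beta_2,\beta_2), (\beta_3,\beta_3), (\beta_1,\beta_2), (\beta_3,\beta_2), (\gamma_0,\gamma_0), (\delta_0,\delta_0)\}$,
  where $\alpha_1 = \{0\}$, $\alpha_2 = \{1,2\}$, $\beta_1 = \{3\}$, $\beta_2 = \{4\}$, $\beta_3 = \{5\}$,
  $\gamma_0 = \{6,7\}$, $\delta_0 = \{8,9\}$.

- The second iteration results in the relations:

  $\Sigma_2 = \{\alpha_1, \alpha_2, \beta_1, \beta_2, \beta_3, \gamma_1, \gamma_2, \delta_0\}$,
  $H_2 = \{(\alpha_1,\alpha_1), (\alpha_2,\alpha_2), (\beta_1,\beta_1), (\beta_2,\beta_2), (\beta_3,\beta_3), (\beta_1,\beta_2), (\beta_3,\beta_2), (\gamma_1,\gamma_1), (\gamma_2,\gamma_2), (\gamma_1,\gamma_2), (\delta_0,\delta_0)\}$,
  where $\alpha_1 = \{0\}$, $\alpha_2 = \{1,2\}$, $\beta_1 = \{3\}$, $\beta_2 = \{4\}$, $\beta_3 = \{5\}$,
  $\gamma_1 = \{6\}$, $\gamma_2 = \{7\}$, $\delta_0 = \{8,9\}$.

- The third iteration results in the relations:

  $\Sigma_3 = \Sigma_2$, $H_3 = H_2$, change = false.

The equivalence classes are:

$\alpha_1 = \{0\}$, $\alpha_2 = \{1,2\}$, $\beta_1 = \{3\}$, $\beta_2 = \{4\}$, $\beta_3 = \{5\}$,
$\gamma_1 = \{6\}$, $\gamma_2 = \{7\}$, $\delta_0 = \{8,9\}$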



Since the third iteration results in no change to the computed partition or
ordering relation, the algorithm terminates. $\Sigma_2$ is the final set of
equivalence classes, which constitutes the set $S_q$ of states of $M_q$. $H_2$
is the maximal simulation relation over $M_q \times M_q$. The proof of
correctness of the algorithm can be found in the full version.



5.2 Space and Time Complexity 

The space complexity of the Partitioning Algorithm depends on the size of
$\Sigma_i$. We assume that the algorithm is applied to Kripke structures with
some redundancy, thus $|\Sigma_i| \ll |S|$.

We measure the space complexity with respect to the size of the following
three relations:

1. The relation R.

2. The relations $H_i$, whose size depends on $\Sigma_i$. We can bound the size
   of $H_i$ by $|\Sigma_i|^2$.

3. A relation that relates each state to its equivalence class. Since every
   state belongs to a single class, the size of this relation is
   $O(|S| \cdot \log(|\Sigma_i|))$.

In the ith iteration we do not need to keep all $H_0, H_1, \ldots$ and
$\Sigma_0, \Sigma_1, \ldots$, since we only refer to $H_i, H_{i+1}$ and
$\Sigma_i, \Sigma_{i+1}$. By the above we conclude that the total space
complexity is $O(|R| + |\Sigma_k|^2 + |S| \cdot \log(|\Sigma_k|))$.

In practice, we often do not hold the transition relation R in memory.
Rather, we use it to provide, whenever needed, the set of successors of a given
state. Thus, the space complexity is $O(|\Sigma_k|^2 + |S| \cdot \log(|\Sigma_k|))$.
Recall that the space complexity of the naive algorithm for computing the
equivalence classes of the simulation equivalence relation is bounded by
$|S|^2$, which is the size of the simulation relation over $M \times M$. In case
$|\Sigma_k| \ll |S|$, the Partitioning Algorithm achieves a much better space
complexity.

As we already mentioned, the algorithm runs at most $|S|^2$ iterations. In
every iteration it performs one refine and one update. refine can be done in
$O(|\Sigma_k|^2 + |\Sigma_k| \cdot |R|)$ and update can be done in
$O(|\Sigma_k|^2 \cdot (|\Sigma_k|^2 + |R|))$. Thus the total time complexity is
$O(|S|^2 \cdot |\Sigma_k|^2 \cdot (|\Sigma_k|^2 + |R|))$.
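To give a feel for the space bounds, here is a back-of-the-envelope instance. The sizes and the base-2 logarithm are assumptions chosen for illustration; they do not come from the paper, and constant factors are ignored.

```latex
% Assumed sizes: |S| = 1024 states collapsing into |\Sigma_k| = 16 classes.
\begin{align*}
|\Sigma_k|^2 + |S| \cdot \log_2 |\Sigma_k| &= 16^2 + 1024 \cdot 4 = 256 + 4096 = 4352, \\
|S|^2 &= 1024^2 = 1\,048\,576 .
\end{align*}
```

Under these assumed sizes the partition-based bound is smaller than the naive $|S|^2$ bound by a factor of roughly 240, which is the kind of gap the condition $|\Sigma_k| \ll |S|$ is meant to capture.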
Abstract. On a case study, we present a new approach for verifying
cryptographic protocols, based on rewriting and on tree automata tech- 
niques. Protocols are operationally described using Term Rewriting Sys- 
tems and the initial set of communication requests is described by a tree 
automaton. Starting from these two representations, we automatically 
compute an over-approximation of the set of exchanged messages (also 
recognized by a tree automaton). Then, proving classical properties like 
confidentiality or authentication can be done by automatically showing 
that the intersection between the approximation and a set of prohibited 
behaviors is the empty set. Furthermore, this method enjoys a simple 
and powerful way to describe intruder work, the ability to consider an 
unbounded number of parties, an unbounded number of interleaved ses- 
sions, and a theoretical property ensuring safeness of the approximation. 



Introduction 

In this paper, we present a new way of verifying cryptographic protocols. We
do not aim here at discovering attacks on the protocol; our goal is to prove
that there are none, which is a more difficult problem. In practice, positive
proofs of security properties on cryptographic protocols are highly desirable re- 
sults since they give a better guarantee on the reliability of the protocol than 
any amount of passed tests. In [9], a decidable approximation of the set of de- 
scendants (reachable terms) was presented. In this paper, we propose to apply 
those theoretical results to the verification of cryptographic protocols. Our case 
study is the Needham-Schroeder Public Key protocol [19] (NSPK for short). We 
chose this particular example for two reasons. First of all, this protocol is real 
but can be easily understood. The second reason is that, in spite of its appar- 
ent simplicity and robustness, and in spite of several verification attempts, this 
protocol designed in 1978 was proved insecure only in 1995 by G. Lowe [13] and
in 1996 by C. Meadows [17]. In particular, G. Lowe found a smart attack
invalidating the main security properties of the protocol. In this paper, we will use
the corrected version of the NSPK protocol also proposed by G. Lowe in [14]. 

Starting from a TRS representing the protocol and a tree automaton rec- 
ognizing the initial set of communication requests, we automatically compute 
a superset of the set of exchanged messages by over-approximating the set of
reachable terms. This model, also a tree automaton, takes into account an
unbounded number of parties, an unbounded number of interleaved sessions as well
as a powerful intruder activity description. For building this model, we needed 
to extend the approximation technique of [9] , initially designed to approximate 
functional programs encoded by left-linear TRSs, to the more general class of 
TRSs (possibly non left-linear) with associative and commutative symbols. 

In section 1, we recall basic definitions of terms, term rewriting systems, and
tree automata. In section 2, we recall the technique for approximating the set of
descendants for left-linear term rewriting systems and regular sets of terms [9].
In section 3, we briefly present the Needham-Schroeder Public Key protocol and
comment on its expected properties, and in section 4 we propose an encoding into
a term rewriting system. However, the term rewriting system describing the NSPK is
not left-linear, has Associative and Commutative (AC for short) symbols and, 
consequently, is out of the scope of the basic approximation technique of [9]. 
Thus, in section 5, we show how to extend our technique to the case of non 
left-linear and AC TRSs. We also describe the application of approximation to 
NSPK and show how to prove confidentiality and authentication properties. Fi- 
nally, in section 6, we conclude, compare with other approaches and present 
ongoing developments. 

1 Preliminaries 

We now introduce some notations and basic definitions. Comprehensive surveys 
can be found in [7] for term rewriting systems, in [3] for tree automata and tree 
language theory, and in [11] for connections between regular tree languages and 
term rewriting systems. 

Terms, Substitutions, Rewriting Systems 

Let $\mathcal{F}$ be a finite set of symbols associated with an arity function,
$\mathcal{X}$ be a countable set of variables, $\mathcal{T}(\mathcal{F},\mathcal{X})$
the set of terms, and $\mathcal{T}(\mathcal{F})$ the set of ground terms (terms
without variables). Positions in a term are represented as sequences of integers.
The set of positions in a term t, denoted by $\mathcal{P}os(t)$, is ordered by
lexicographic ordering $\prec$. The empty sequence $\epsilon$ denotes the
top-most position. If $p \in \mathcal{P}os(t)$, then $t|_p$ denotes the subterm
of t at position p and $t[s]_p$ denotes the term obtained by replacement of the
subterm $t|_p$ at position p by the term s. For any term
$s \in \mathcal{T}(\mathcal{F},\mathcal{X})$, we denote by
$\mathcal{P}os_{\mathcal{F}}(s)$ the set of functional positions in s, i.e.
$\{p \in \mathcal{P}os(s) \mid p \neq \epsilon \text{ and } \mathcal{R}oot(s|_p) \in \mathcal{F}\}$,
where $\mathcal{R}oot(t)$ denotes the symbol at position $\epsilon$ in t. A
ground context is a term of $\mathcal{T}(\mathcal{F} \cup \{\square\})$ with
exactly one occurrence of $\square$, where $\square$ is a special constant not
occurring in $\mathcal{F}$. For any term $t \in \mathcal{T}(\mathcal{F})$, C[t]
denotes the term obtained after replacement of $\square$ by t in the ground
context C[]. The set of variables of a term t is denoted by $\mathcal{V}ar(t)$.
A term is linear if any variable of $\mathcal{V}ar(t)$ has exactly one
occurrence in t. A substitution is a mapping $\sigma$ from $\mathcal{X}$ into
$\mathcal{T}(\mathcal{F},\mathcal{X})$, which can uniquely be extended to an
endomorphism of $\mathcal{T}(\mathcal{F},\mathcal{X})$. Its domain
$\mathcal{D}om(\sigma)$ is $\{x \in \mathcal{X} \mid x\sigma \neq x\}$.
A term rewriting system $\mathcal{R}$ is a set of rewrite rules $l \to r$, where
$l, r \in \mathcal{T}(\mathcal{F},\mathcal{X})$, $l \notin \mathcal{X}$, and
$\mathcal{V}ar(l) \supseteq \mathcal{V}ar(r)$. A rewrite rule $l \to r$ is
left-linear (resp. right-linear) if the left-hand side (resp. right-hand side)
of the rule is linear. A rule is linear if it is both left- and right-linear. A
TRS $\mathcal{R}$ is linear (resp. left-linear, right-linear) if every rewrite
rule $l \to r$ of $\mathcal{R}$ is linear (resp. left-linear, right-linear).

The relation $\to_{\mathcal{R}}$ induced by $\mathcal{R}$ is defined as follows:
for any $s, t \in \mathcal{T}(\mathcal{F},\mathcal{X})$, $s \to_{\mathcal{R}} t$
if there exist a rule $l \to r$ in $\mathcal{R}$, a position
$p \in \mathcal{P}os(s)$ and a substitution $\sigma$ such that
$l\sigma = s|_p$ and $t = s[r\sigma]_p$. The reflexive transitive closure of
$\to_{\mathcal{R}}$ is denoted by $\to^*_{\mathcal{R}}$. The set of
$\mathcal{R}$-descendants of a set of ground terms E is denoted by
$\mathcal{R}^*(E)$ and
$\mathcal{R}^*(E) = \{t \in \mathcal{T}(\mathcal{F}) \mid \exists s \in E \text{ s.t. } s \to^*_{\mathcal{R}} t\}$.
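The definitions of positions, subterms, replacement and the one-step rewriting relation can be sketched in Python. The encoding is an assumption made for illustration: terms are nested tuples `("f", t1, ..., tn)`, variables are bare strings, and all function names are hypothetical.

```python
# Terms are nested tuples ("f", t1, ..., tn); variables are plain strings.

def positions(t):
    """All positions of t as tuples of 1-based child indices; () is the root."""
    yield ()
    if isinstance(t, tuple):
        for i, child in enumerate(t[1:], start=1):
            for p in positions(child):
                yield (i,) + p

def subterm(t, p):
    """t|_p: follow the child indices in p."""
    for i in p:
        t = t[i]
    return t

def replace(t, p, s):
    """t[s]_p: replace the subterm of t at position p by s."""
    if not p:
        return s
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], s),) + t[i + 1:]

def match(pattern, t, sigma=None):
    """Return a substitution sigma with pattern*sigma == t, or None."""
    sigma = dict(sigma or {})
    if isinstance(pattern, str):            # variable
        if pattern in sigma and sigma[pattern] != t:
            return None
        sigma[pattern] = t
        return sigma
    if not isinstance(t, tuple) or t[0] != pattern[0] or len(t) != len(pattern):
        return None
    for pc, tc in zip(pattern[1:], t[1:]):
        sigma = match(pc, tc, sigma)
        if sigma is None:
            return None
    return sigma

def substitute(t, sigma):
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(substitute(c, sigma) for c in t[1:])

def rewrite_steps(t, rules):
    """All terms obtained from t by exactly one rewrite step."""
    out = []
    for p in positions(t):
        for lhs, rhs in rules:
            sigma = match(lhs, subterm(t, p))
            if sigma is not None:
                out.append(replace(t, p, substitute(rhs, sigma)))
    return out

# The rule f(g(x)) -> g(f(x)) applied to f(f(g(a))).
RULES = [(("f", ("g", "x")), ("g", ("f", "x")))]
TERM = ("f", ("f", ("g", ("a",))))
```

Here the only redex of `TERM` is at position 1, so a single step yields f(g(f(a))).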



Automata, Regular Tree Languages 

Let $\mathcal{Q}$ be a finite set of symbols, with arity 0, called states.
$\mathcal{T}(\mathcal{F} \cup \mathcal{Q})$ is called the set of configurations.
A transition is a rewrite rule $c \to q$, where
$c \in \mathcal{T}(\mathcal{F} \cup \mathcal{Q})$ and $q \in \mathcal{Q}$. A
normalized transition is a transition $c \to q$ where $c = q' \in \mathcal{Q}$ or
$c = f(q_1, \ldots, q_n)$, $f \in \mathcal{F}$, $ar(f) = n$, and
$q_1, \ldots, q_n \in \mathcal{Q}$. A bottom-up non-deterministic finite tree
automaton (tree automaton for short) is a quadruple
$\mathcal{A} = \langle \mathcal{F}, \mathcal{Q}, \mathcal{Q}_f, \Delta \rangle$,
where $\mathcal{Q}_f \subseteq \mathcal{Q}$ and $\Delta$ is a set of normalized
transitions. A tree automaton is deterministic if there are no two rules with
the same left-hand side. The rewriting relation induced by $\Delta$ is denoted
either by $\to_\Delta$ or by $\to_{\mathcal{A}}$. The tree language recognized by
$\mathcal{A}$ is
$\mathcal{L}(\mathcal{A}) = \{t \in \mathcal{T}(\mathcal{F}) \mid \exists q \in \mathcal{Q}_f \text{ s.t. } t \to^*_{\mathcal{A}} q\}$.
For a given $q \in \mathcal{Q}$, the tree language recognized by $\mathcal{A}$
and q is
$\mathcal{L}(\mathcal{A}, q) = \{t \in \mathcal{T}(\mathcal{F}) \mid t \to^*_{\mathcal{A}} q\}$.
A tree language (or a set of terms) E is regular if there exists a bottom-up
tree automaton $\mathcal{A}$ such that $\mathcal{L}(\mathcal{A}) = E$. The class
of regular tree languages is closed under boolean operations $\cup$, $\cap$,
$\setminus$, and inclusion is decidable. A $\mathcal{Q}$-substitution is a
substitution $\sigma : \mathcal{X} \mapsto \mathcal{Q}$. Let
$\Sigma(\mathcal{Q}, \mathcal{X})$ be the set of $\mathcal{Q}$-substitutions.
For every transition, there exists an equivalent set of normalized transitions.
Normalization consists in decomposing a transition $s \to q$ into a set
$Norm(s \to q)$ of normalized transitions. The method consists in abstracting
subterms s' of s s.t. $s' \notin \mathcal{Q}$ by states of $\mathcal{Q}$. We
first define the abstraction function as follows:

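Definition 1. Let $\mathcal{F}$ be a set of symbols, and $\mathcal{Q}$ a set of
states. For a given configuration
$s \in \mathcal{T}(\mathcal{F} \cup \mathcal{Q}) \setminus \mathcal{Q}$, an
abstraction of s is a mapping $\alpha$:

$\alpha : \{s|_p \mid p \in \mathcal{P}os_{\mathcal{F}}(s)\} \mapsto \mathcal{Q}$

The mapping $\alpha$ is extended on $\mathcal{T}(\mathcal{F} \cup \mathcal{Q})$
by defining $\alpha$ as identity on $\mathcal{Q}$, i.e.
$\forall q \in \mathcal{Q} : \alpha(q) = q$.

Definition 2. Let $\mathcal{F}$ be a set of symbols, $\mathcal{Q}$ a set of
states, $s \to q$ a transition s.t.
$s \in \mathcal{T}(\mathcal{F} \cup \mathcal{Q})$ and $q \in \mathcal{Q}$, and
$\alpha$ an abstraction of s. The set $Norm_\alpha(s \to q)$ of normalized
transitions is inductively defined by:

1. if $s = q$, then $Norm_\alpha(s \to q) = \emptyset$, and

2. if $s \in \mathcal{Q}$ and $s \neq q$, then $Norm_\alpha(s \to q) = \{s \to q\}$, and

3. if $s = f(t_1, \ldots, t_n)$, then
   $Norm_\alpha(s \to q) = \{f(\alpha(t_1), \ldots, \alpha(t_n)) \to q\} \cup \bigcup_{i=1}^{n} Norm_\alpha(t_i \to \alpha(t_i))$.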

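Example 1. Let $\mathcal{F} = \{f, g, a\}$ and
$\mathcal{A} = \langle \mathcal{F}, \mathcal{Q}, \mathcal{Q}_f, \Delta \rangle$,
where $\mathcal{Q} = \{q_0, q_1, q_2, q_3, q_4\}$, $\mathcal{Q}_f = \{q_0\}$, and
$\Delta = \{f(q_1) \to q_0,\ g(q_1, q_1) \to q_1,\ a \to q_1\}$.

• The languages recognized by $q_1$ and $q_0$ are the following:
$\mathcal{L}(\mathcal{A}, q_1)$ is the set of terms built on $\{g, a\}$, i.e.
$\mathcal{L}(\mathcal{A}, q_1) = \mathcal{T}(\{g, a\})$, and
$\mathcal{L}(\mathcal{A}, q_0) = \mathcal{L}(\mathcal{A}) = \{f(x) \mid x \in \mathcal{L}(\mathcal{A}, q_1)\}$.

• Let $s = f(g(q_1, f(a)))$, and $\alpha_1$ be an abstraction of s, mapping
$g(q_1, f(a))$ to $q_2$, $f(a)$ to $q_3$ and $a$ to $q_4$. The normalization of
transition $f(g(q_1, f(a))) \to q_0$ with abstraction $\alpha_1$ is the
following:
$Norm_{\alpha_1}(f(g(q_1, f(a))) \to q_0) = \{f(q_2) \to q_0,\ g(q_1, q_3) \to q_2,\ f(q_4) \to q_3,\ a \to q_4\}$.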
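The inductive definition of normalization can be sketched in Python and checked against the second item of Example 1. The encoding is an assumption made for illustration: configurations are nested tuples, states are bare strings, a transition is a pair `(left, state)`, and the abstraction is a dictionary from subterms to states.

```python
def normalize(s, q, alpha):
    """Norm_alpha(s -> q) following the three cases of Definition 2."""
    if s == q:                      # case 1: trivial transition, nothing to add
        return set()
    if isinstance(s, str):          # case 2: s is a state different from q
        return {(s, q)}

    def abstract(t):                # alpha extended as the identity on states
        return t if isinstance(t, str) else alpha[t]

    # case 3: s = f(t1, ..., tn)
    out = {((s[0],) + tuple(abstract(t) for t in s[1:]), q)}
    for t in s[1:]:
        out |= normalize(t, abstract(t), alpha)
    return out

# Example 1: s = f(g(q1, f(a))) with alpha_1 as given in the text.
A = ("a",)
F_A = ("f", A)
G_TERM = ("g", "q1", F_A)
S = ("f", G_TERM)
ALPHA_1 = {G_TERM: "q2", F_A: "q3", A: "q4"}
NORMALIZED = normalize(S, "q0", ALPHA_1)
```

The result contains exactly the four normalized transitions listed in Example 1.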

2 Approximation Technique 

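For a regular set of terms $E \subseteq \mathcal{T}(\mathcal{F})$, although there
exist some restricted classes of TRSs $\mathcal{R}$ such that $\mathcal{R}^*(E)$
is regular (see [5, 21, 4, 12]), this is not the case in general [11, 12]. In
[9], for any tree automaton $\mathcal{A}$ (s.t.
$\mathcal{L}(\mathcal{A}) \supseteq E$) and for any left-linear TRS
$\mathcal{R}$, it is proposed to build an approximation automaton
$\mathcal{T}_{\mathcal{R}}\uparrow(\mathcal{A})$ such that
$\mathcal{L}(\mathcal{T}_{\mathcal{R}}\uparrow(\mathcal{A})) \supseteq \mathcal{R}^*(E)$.
The quality of the approximation highly depends on an approximation function
called $\gamma$ which defines some folding positions: subterms which can be
approximated. We now briefly recall the construction of
$\mathcal{T}_{\mathcal{R}}\uparrow(\mathcal{A})$ [9]:

Let $\mathcal{R}$ be a left-linear term rewriting system and
$\mathcal{A} = \langle \mathcal{F}, \mathcal{Q}, \mathcal{Q}_f, \Delta \rangle$ a
tree automaton such that $E = \mathcal{L}(\mathcal{A})$ (or even
$E \subseteq \mathcal{L}(\mathcal{A})$). First, we extend the set of states
$\mathcal{Q}$ of $\mathcal{A}$ with an infinite number of new states, initially
not occurring in $\mathcal{Q}$. Note that since we do not modify $\Delta$ nor
$\mathcal{Q}_f$ (in particular, they remain finite), the language recognized by
$\mathcal{A}$ is the same. On the other hand, it is always possible to come back
to a finite set of states for $\mathcal{A}$ by restricting $\mathcal{Q}$ to the
set of accessible states, i.e. states q such that
$\mathcal{L}(\mathcal{A}, q) \neq \emptyset$.

Starting from $\mathcal{A}_0 = \mathcal{A}$, we incrementally build a finite
number of tree automata
$\mathcal{A}_i = \langle \mathcal{F}, \mathcal{Q}, \mathcal{Q}_f, \Delta_i \rangle$
with $i \geq 0$ such that
$\forall i \geq 0 : \mathcal{L}(\mathcal{A}_i) \subset \mathcal{L}(\mathcal{A}_{i+1})$
until we get an automaton $\mathcal{A}_k$ with $k \in \mathbb{N}$ such that
$\mathcal{L}(\mathcal{A}_k) \supseteq \mathcal{R}^*(\mathcal{L}(\mathcal{A}_0))$,
i.e. $\mathcal{L}(\mathcal{A}_k) \supseteq \mathcal{R}^*(E)$. We denote by
$\mathcal{T}_{\mathcal{R}}\uparrow(\mathcal{A})$ this automaton $\mathcal{A}_k$.
To construct $\mathcal{A}_{i+1}$ from $\mathcal{A}_i$, the technique consists in
finding a term s in $\mathcal{L}(\mathcal{A}_i)$ such that
$s \to_{\mathcal{R}} t$ and $t \notin \mathcal{L}(\mathcal{A}_i)$, and then in
building $\Delta_{i+1}$ such that
$\mathcal{L}(\mathcal{A}_i) \subset \mathcal{L}(\mathcal{A}_{i+1})$ and
$t \in \mathcal{L}(\mathcal{A}_{i+1})$.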




Since $\mathcal{A}_i$ and $\mathcal{A}_{i+1}$ only differ by their respective
transition sets, to ensure
$\mathcal{L}(\mathcal{A}_i) \subset \mathcal{L}(\mathcal{A}_{i+1})$ it is enough
to construct $\Delta_{i+1}$ such that it strictly contains





Rewriting for Cryptographic Protocol Verification 275 

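$\Delta_i$. In order to have also $t \in \mathcal{L}(\mathcal{A}_{i+1})$ it is
necessary to add some transitions to $\Delta_i$ to obtain $\Delta_{i+1}$. This
can be viewed as a completion step between the two term rewriting systems: the
set of transitions $\Delta_i$ of $\mathcal{A}_i$ and $\mathcal{R}$. If there
exists a term s in $\mathcal{L}(\mathcal{A}_i)$ such that
$s \to_{\mathcal{R}} t$, by definition of $\to_{\mathcal{R}}$, there exist a
rule $l \to r$, a ground context C[] and a substitution (a match) $\sigma$ such
that $s = C[l\sigma] \to_{\mathcal{R}} C[r\sigma] = t$. On the other hand, by
construction of tree automata, $s = C[l\sigma] \in \mathcal{L}(\mathcal{A}_i)$
means that (1) there exists a state $q \in \mathcal{Q}$ such that
$l\sigma \to^*_{\mathcal{A}_i} q$ and (2) $C[q] \to^*_{\mathcal{A}_i} q'$ such
that $q' \in \mathcal{Q}_f$. Hence, from (1) we know that we have the following
critical pair between transitions of $\mathcal{A}_i$ and rules of $\mathcal{R}$:

$$
\begin{array}{ccc}
l\sigma & \xrightarrow{\ \mathcal{R}\ } & r\sigma \\
{\scriptstyle \mathcal{A}_i,\,*}\,\big\downarrow & & \\
q & &
\end{array}
$$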

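Since every transition of $\mathcal{A}_i$ is in $\mathcal{A}_{i+1}$ (i.e.
$\Delta_i \subset \Delta_{i+1}$), for the term t to be recognized by
$\mathcal{A}_{i+1}$, it is enough to ensure that (3)
$r\sigma \to^*_{\mathcal{A}_{i+1}} q$. This is sufficient since we can then
rewrite $t = C[r\sigma]$ into $C[q]$ and from (2) we get that
$C[q] \to^*_{\mathcal{A}_{i+1}} q'$, since $\Delta_i \subset \Delta_{i+1}$.
Finally, since $q' \in \mathcal{Q}_f$, $t \in \mathcal{L}(\mathcal{A}_{i+1})$.

To ensure (3), we need to add some transitions to $\Delta_{i+1}$, i.e. join the
critical pair:

$$
\begin{array}{ccc}
l\sigma & \xrightarrow{\ \mathcal{R}\ } & r\sigma \\
{\scriptstyle \mathcal{A}_i,\,*}\,\big\downarrow & \swarrow\,{\scriptstyle \mathcal{A}_{i+1},\,*} & \\
q & &
\end{array}
$$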

A direct solution to have $r\sigma \to^*_{\mathcal{A}_{i+1}} q$ is to have a
transition of the form $r\sigma \to q$ in $\Delta_{i+1}$. However, this is not
compatible with the standard normalized form of the tree automata we use
here^1. Thus, before adding $r\sigma \to q$ to the transitions of $\Delta_i$, we
normalize it first thanks to the $Norm_\alpha$ function (see definition 2).
Hence, $\Delta_{i+1} = \Delta_i \cup Norm_\alpha(r\sigma \to q)$. We give here
an example of the completion process on a simple TRS.



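Example 2. Let $\mathcal{F} = \{f, g, a\}$ and $\mathcal{R}$ the one-rule TRS
$\mathcal{R} = \{f(g(x)) \to g(f(x))\}$. Let
$\mathcal{A}_0 = \langle \mathcal{F}, \mathcal{Q}, \mathcal{Q}_f, \Delta_0 \rangle$
such that $\mathcal{Q}_f = \{q_f\}$ and
$\Delta_0 = \{f(q_f) \to q_f,\ g(q_a) \to q_f,\ a \to q_a\}$. We have
$\mathcal{L}(\mathcal{A}_0) = f^*(g(a))$. Between $\mathcal{R}$ and the
transitions of $\mathcal{A}_0$ there exists a critical pair:

$$
\begin{array}{ccc}
f(g(q_a)) & \xrightarrow{\ \mathcal{R}\ } & g(f(q_a)) \\
{\scriptstyle \mathcal{A}_0,\,*}\,\big\downarrow & & \\
q_f & &
\end{array}
$$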



^1 Keeping tree automata in standard normalized form allows, in particular, to
apply usual algorithms: intersection, union, etc.
The $\mathcal{Q}$-substitution used here is $\sigma = \{x \mapsto q_a\}$. As
defined before, we have
$\Delta_1 = \Delta_0 \cup Norm_\alpha(g(f(q_a)) \to q_f)$. Let $\alpha$ be the
abstraction function such that $\alpha(f(q_a)) = q_{new}$ where $q_{new}$ is a
state not occurring in the transitions of $\Delta_0$. Then, we have
$\Delta_1 = \Delta_0 \cup \{g(q_{new}) \to q_f,\ f(q_a) \to q_{new}\}$.

Except in some simple decidable cases, this completion procedure is not
guaranteed to converge but, instead, may infinitely add new transitions and thus
generate an infinite number of tree automata $\mathcal{A}_1, \mathcal{A}_2$,
etc. However, choosing particular values for $\alpha$ may force the completion
process to converge by approximating infinitely many transitions by finite sets
of more general transitions. Those particular abstraction functions are
associated with approximation functions denoted by $\gamma$, defining some
folding positions: positions in the right-hand side of rules where subterms are
approximated by regular languages. For each completion step from
$\mathcal{A}_i$ to $\mathcal{A}_{i+1}$ involving a rewrite step
$l\sigma \to_{\mathcal{R}} r\sigma$, a folding position p is a position in r
which is assigned a state q' such that we only ensure
$\mathcal{L}(\mathcal{A}_{i+1}, q') \supseteq \{r\sigma|_p\}$ instead of strict
equality: $\mathcal{L}(\mathcal{A}_{i+1}, q') = \{r\sigma|_p\}$. This comes from
the fact that the same state q' can be used for recognizing different terms
obtained by different positions, rules or substitutions. The role of the
approximation function is to relate $r\sigma|_p$ and the state q'. Folding
positions depend on the applied rule $l \to r$ and on the substitution
$\sigma$. Furthermore, since in our setting a rewriting step
$s = C[l\sigma] \to_{\mathcal{R}} C[r\sigma] = t$ is modeled by a completion
step on the critical pair $l\sigma \to_{\mathcal{R}} r\sigma$ and
$l\sigma \to^*_{\mathcal{A}_i} q$, q is also a parameter of the approximation
function. Finally, the approximation function $\gamma$ maps every triple
$(l \to r, q, \sigma)$ to a sequence of states (one for each position in
$\mathcal{P}os_{\mathcal{F}}(r)$) used for the normalization of the transition
$r\sigma \to q$.

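Definition 3. Let $\mathcal{Q}$ be a set of states and $\mathcal{Q}^*$ the set
of sequences $q_1 \cdots q_k$ of states in $\mathcal{Q}$. An approximation
function is a mapping
$\gamma : \mathcal{R} \times \mathcal{Q} \times \Sigma(\mathcal{Q}, \mathcal{X}) \mapsto \mathcal{Q}^*$,
such that $\gamma(l \to r, q, \sigma) = q_1 \cdots q_k$, where
$k = Card(\mathcal{P}os_{\mathcal{F}}(r))$.

From every $\gamma(l \to r, q, \sigma) = q_1 \cdots q_k$, we can associate
$q_1, \ldots, q_k$ to positions $p_1, \ldots, p_k$ in
$\mathcal{P}os_{\mathcal{F}}(r)$. This can be done by defining the corresponding
abstraction function $\alpha$ on the restricted domain
$\{r\sigma|_p \mid \forall l \to r \in \mathcal{R}, \forall p \in \mathcal{P}os_{\mathcal{F}}(r), \forall \sigma \in \Sigma(\mathcal{Q}, \mathcal{X})\}$
by

$\alpha(r\sigma|_{p_i}) = q_i$

for all $p_i \in \mathcal{P}os_{\mathcal{F}}(r) = \{p_1, \ldots, p_k\}$, s.t.
$p_i \prec p_{i+1}$ for $i = 1 \ldots k-1$ (where $\prec$ is the lexicographic
ordering). In the following, we will write $Norm_\gamma$ for the normalization
function whose $\alpha$ value is defined according to $\gamma$ as above.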

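Starting from a left-linear TRS $\mathcal{R}$, a tree automaton $\mathcal{A}$
and an approximation function $\gamma$, the algorithm for building the
approximation automaton $\mathcal{T}_{\mathcal{R}}\uparrow(\mathcal{A})$ is the
following. First, set $\mathcal{A}_0$ to $\mathcal{A}$. Then, to construct
$\mathcal{A}_{i+1}$ from $\mathcal{A}_i$:

1. search for a critical pair, i.e. a state $q \in \mathcal{Q}$, a rewrite rule
   $l \to r$ and a substitution $\sigma \in \Sigma(\mathcal{Q}, \mathcal{X})$
   such that $l\sigma \to^*_{\mathcal{A}_i} q$ and
   $r\sigma \not\to^*_{\mathcal{A}_i} q$.

2. $\Delta_{i+1} = \Delta_i \cup Norm_\gamma(r\sigma \to q)$.

This process is iterated until it stops on a tree automaton $\mathcal{A}_k$ such
that $\forall q \in \mathcal{Q}$, $\forall l \to r \in \mathcal{R}$ and
$\forall \sigma \in \Sigma(\mathcal{Q}, \mathcal{X})$, if
$l\sigma \to^*_{\mathcal{A}_k} q$ then $r\sigma \to^*_{\mathcal{A}_k} q$. Then,
$\mathcal{T}_{\mathcal{R}}\uparrow(\mathcal{A}) = \mathcal{A}_k$. The fact that
$\mathcal{Q}$ and $\Sigma(\mathcal{Q}, \mathcal{X})$ may be infinite is not a
problem in practice since, for finding a critical pair, we can restrict
$\mathcal{Q}$ to the finite set of accessible states in $\mathcal{A}_i$, without
changing $\mathcal{L}(\mathcal{A}_i)$ nor $\mathcal{L}(\mathcal{A}_{i+1})$. We
now recall a theorem of [9].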

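Theorem 1. (Completeness) Given a tree automaton $\mathcal{A}$ and a left-linear
TRS $\mathcal{R}$, for any approximation function $\gamma$,

$\mathcal{L}(\mathcal{T}_{\mathcal{R}}\uparrow(\mathcal{A})) \supseteq \mathcal{R}^*(\mathcal{L}(\mathcal{A}))$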

The $\gamma$ function fixes the quality of the approximation. For example, one
of the roughest approximations is obtained with a constant $\gamma$ function
mapping every triple $(l \to r, q, \sigma)$ to sequences of q', a unique state
of $\mathcal{Q}$: $\forall l \to r \in \mathcal{R}$,
$\forall \sigma \in \Sigma(\mathcal{Q}, \mathcal{X})$,
$\forall q \in \mathcal{Q} : \gamma(l \to r, q, \sigma) = q' \cdots q'$.
Conversely, the best approximation consists in mapping every triple
$(l \to r, q, \sigma)$ to sequences of distinct states. However, although any
rough approximation built with the first $\gamma$ is guaranteed to terminate,
this is not necessarily the case for the second one.

From a practical point of view, the fact that completeness of the approximation
construction does not depend on the chosen $\gamma$ (Theorem 1) is a very
interesting property. Indeed, it guarantees that for any approximation function,
$\mathcal{T}_{\mathcal{R}}\uparrow(\mathcal{A})$ is a safe model of
$\mathcal{R}^*(E)$, in the sense of abstract interpretation.

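Example 3. Back to Example 2, adding to $\Delta_0$ the transitions
$\{g(q_{new}) \to q_f,\ f(q_a) \to q_{new}\}$ to obtain $\Delta_1$ brings
another critical pair:

$$
\begin{array}{ccc}
f(g(q_{new})) & \xrightarrow{\ \mathcal{R}\ } & g(f(q_{new})) \\
{\scriptstyle \mathcal{A}_1,\,*}\,\big\downarrow & & \\
q_f & &
\end{array}
$$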

Like in the previous example, we build $\Delta_2$ by adding
$Norm_\alpha(g(f(q_{new})) \to q_f)$ to $\Delta_1$. However, if $\alpha$ maps
$f(q_{new})$ to another state $q'_{new}$ not occurring in $\Delta_1$, we add
some new transitions and get another critical pair, and the process may go on
forever. Instead, we can here define an approximation function $\gamma$ in a
simple and static way, for example:
$\forall \sigma \in \Sigma(\mathcal{Q}, \mathcal{X}), \forall q \in \mathcal{Q} : \gamma(f(g(x)) \to g(f(x)), q, \sigma) = q_{new}$.
Since $\mathcal{P}os_{\mathcal{F}}(g(f(x))) = \{1\}$ is a singleton, note that
the $\gamma$ function maps triples of the form
$(f(g(x)) \to g(f(x)), q, \sigma)$ to sequences of states of length one.
This $\gamma$ function defines a very rough approximation since the same state
$q_{new}$ is used for every normalization, whatever values q and $\sigma$ may
be. Thanks to this approximation function $\gamma$, the completion terminates.
The value of $\Delta_1$ remains the same but, for the next completion step, we
have
$Norm_\gamma(g(f(q_{new})) \to q_f) = \{g(q_{new}) \to q_f,\ f(q_{new}) \to q_{new}\}$.
Thus, $\Delta_2 = \Delta_1 \cup \{f(q_{new}) \to q_{new}\}$, there is no new
critical pair between $\mathcal{A}_2$ and rule $f(g(x)) \to g(f(x))$, and we
have $\mathcal{L}(\mathcal{A}_2) = f^*(g(f^*(a)))$.

Once $\mathcal{T}_{\mathcal{R}}\uparrow(\mathcal{A})$ is obtained, it is easy to
verify some reachability properties on $\mathcal{R}$ and E. It can be shown for
example that a regular set of terms F cannot be reached from terms of E by
$\to^*_{\mathcal{R}}$. This can be done by showing that
$\mathcal{L}(\mathcal{T}_{\mathcal{R}}\uparrow(\mathcal{A})) \cap F = \emptyset$.
We will apply this to the verification of the Needham-Schroeder Public Key
Protocol in section 5.
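The emptiness check on the intersection can be sketched in Python with a product construction. This is a minimal sketch under stated assumptions: both automata use only transitions of the form f(q1, ..., qn) -> q (no state-to-state transitions), encoded as pairs `(("f", q1, ..., qn), q)`, and the automaton `APPROX` below is the one obtained for Example 3, written out by hand. The two "forbidden" automata are invented for illustration.

```python
def intersection_is_empty(delta_a, final_a, delta_b, final_b):
    """True iff no ground term is accepted by both automata."""
    # pairs holds (qa, qb) such that some ground term reaches qa in A and qb in B
    pairs = set()
    changed = True
    while changed:
        changed = False
        for la, qa in delta_a:
            for lb, qb in delta_b:
                if (la[0] == lb[0] and len(la) == len(lb)
                        and (qa, qb) not in pairs
                        and all((x, y) in pairs for x, y in zip(la[1:], lb[1:]))):
                    pairs.add((qa, qb))
                    changed = True
    return not any(qa in final_a and qb in final_b for qa, qb in pairs)

# Approximation automaton of Example 3, recognizing f*(g(f*(a))).
APPROX = {
    (("f", "qf"), "qf"), (("g", "qa"), "qf"), (("a",), "qa"),
    (("g", "qnew"), "qf"), (("f", "qa"), "qnew"), (("f", "qnew"), "qnew"),
}

# Hypothetical forbidden sets: {g(g(a))} and {g(a)}.
ONLY_GGA = {(("a",), "p0"), (("g", "p0"), "p1"), (("g", "p1"), "p2")}
ONLY_GA = {(("a",), "p0"), (("g", "p0"), "p1")}

unreachable = intersection_is_empty(APPROX, {"qf"}, ONLY_GGA, {"p2"})
maybe_reachable = not intersection_is_empty(APPROX, {"qf"}, ONLY_GA, {"p1"})
```

Because the automaton over-approximates the descendants, an empty intersection proves unreachability, whereas a non-empty one only says the forbidden term lies in the approximation.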
3 Needham-Schroeder Public Key Protocol

In this section, we present our case study on the Needham-Schroeder Public
Key protocol (NSPK). More precisely, we here use the fixed version of the
protocol [14] without key server. Key servers have been discarded here for the
sake of simplicity. Note that attacks from [14] have been found on the NSPK
without key servers. Moreover, the approximation technique has also been
successfully applied to the protocol with key servers.

The NSPK protocol aims at mutual authentication of two agents, an initiator
A and a responder B, separated by an insecure network. Mutual authentication
means that, when a protocol session is completed between two agents, they
should be assured of each other's identity. In general, the main property expected
for this kind of protocol is to prevent an intruder from impersonating one of
the two agents. This protocol is based on an exchange of nonces (usually fresh
random numbers or time stamps) and on asymmetric encryption of messages:
every agent has a public key (for encryption) and a private key (for decryption).
Every public key is supposed to be known by any agent, whereas the private
key of agent X is supposed to be known only by X. Thus, in this setting, we
suppose that messages encrypted with the public key of X can only be decrypted
and read by X. Here is a description of the three steps of the fixed version of
the protocol, borrowed from [14]:

1. $A \to B : \{N_A, A\}_{K_B}$

2. $B \to A : \{N_A, N_B, B\}_{K_A}$

3. $A \to B : \{N_B\}_{K_B}$

In the first step, A tries to initiate a communication with B: A creates a nonce
$N_A$ and sends to B a message, containing $N_A$ as well as his identity,
encrypted with the public key of B: $K_B$. Then, in the second step, B sends
back to A a message encrypted with the public key of A, containing the nonce
$N_A$ that B received, a new nonce $N_B$, and B's identity. Finally, in the last
step, A returns the nonce $N_B$ he received from B. If the protocol is
completed, mutual authentication of the two agents is ensured:

- as soon as A receives the message containing the nonce $N_A$, sent back by
  B at step 2, A believes that this message was really built and sent by B.
  Indeed, $N_A$ was encrypted with the public key of B and, thus, B is the only
  agent that is able to send back $N_A$,

- similarly, when B receives the message containing the nonce $N_B$, sent back
  by A at step 3, B believes that this message was really built and sent by A.
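The three steps and the two acceptance checks can be traced with a purely symbolic Python sketch. Everything here is an illustrative assumption rather than the paper's model: encryption is a tagged tuple that only the key owner may open, nonces are terms instead of random values, and there is no intruder or network.

```python
def encrypt(key_owner, *fields):
    """Symbolic {fields}_K where K is the public key of key_owner."""
    return ("encr", key_owner, fields)

def decrypt(agent, message):
    """Only the owner of the matching private key can read the fields."""
    tag, key_owner, fields = message
    if tag != "encr" or key_owner != agent:
        raise PermissionError(f"{agent} cannot decrypt a message for {key_owner}")
    return fields

def run_session(a="A", b="B"):
    n_a, n_b = ("N", a, b), ("N", b, a)           # nonces, modeled as terms
    m1 = encrypt(b, n_a, a)                        # 1. A -> B : {Na, A}Kb
    got_n_a, initiator = decrypt(b, m1)
    m2 = encrypt(initiator, got_n_a, n_b, b)       # 2. B -> A : {Na, Nb, B}Ka
    back_n_a, got_n_b, responder = decrypt(a, m2)
    # A accepts if its own nonce comes back together with B's identity
    a_accepts = back_n_a == n_a and responder == b
    m3 = encrypt(b, got_n_b)                       # 3. A -> B : {Nb}Kb
    (back_n_b,) = decrypt(b, m3)
    b_accepts = back_n_b == n_b                    # B accepts if its nonce comes back
    return a_accepts, b_accepts
```

An honest run makes both agents accept. The sketch says nothing about security, which is exactly what the approximation technique is meant to establish.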

Another property that may be expected for this kind of protocol is confidentiality 
of nonces. In particular, if nonces remain confidential, they can be used later as 
keys for symmetric encryption of communications between A and B. Thus, 
confidentiality of nonces may also be of interest. 

A cryptographic protocol is supposed to resist any attack by an intruder. In
particular for NSPK, we intend to show that, for agents respecting the protocol,
and whatever the intruder may do:

- nonces and private keys remain confidential (confidentiality),

- if an agent X believes that a message was built by another agent Y, then
  the message was effectively built by Y (authentication).

4 Encoding the Protocol and the Intruder 

In this section, we show how to model NSPK by a TRS. First, we present the
signature $\mathcal{F}$ and the terms of $\mathcal{T}(\mathcal{F})$ used for
representing agents, messages, keys, etc. Each agent is labeled by a unique
identifier; let $L_{agt}$ be the set of agent labels (terms representing agent
labels will be given later). For any agent label $l \in L_{agt}$, the term
agt(l) will denote the agent whose label is l. The term mesg(x, y, c) will
represent a message whose header refers to agent x as emitter, agent y as
receiver and whose contents is c. The term pubkey(a) denotes the public key of
agent a and encr(k, a, c) denotes the result of encryption of content c by key
k. In this last term, a is a flag recording who has performed the encryption.
This field is not used by the protocol rules but will be used for verification.
The term N(x, y) represents a nonce generated by agent x for identifying a
communication with y. We also use an AC binary symbol $\cup$ in order to
represent sets. For example the term $x \cup (y \cup z)$ (equivalent modulo AC
to $(x \cup y) \cup z$) will represent the set $\{x, y, z\}$.

Starting from a set of initial requests, our aim is to compute a tree automaton
recognizing an over-approximation of all sent messages. The approximation
also contains some terms signaling either communication requests or established
communications. For example, a term of the form goal(x, y) means that x expects
to open a communication with y. A term of the form c_init(x, y, z) means that
x believes to have initiated a communication with y but, in reality, x
communicates with z. Conversely, a term c_resp(y, x, z) means that y believes to
have responded to a communication request coming from x but z is the real author
of the request.

Then, encoding of the protocol into AC rewrite rules^2 is straightforward:
each step of the protocol is described by a rewrite rule whose left-hand
side is a precondition on the current state (set of received messages and
communication requests), and whose right-hand side represents the message to be
sent (and sometimes an established communication) if the precondition is met.
The sent message is added to the current state. As a result, every rewrite rule
we use is a 'cumulative rule', i.e. of the form $l \to l \cup r$. Thus, for
convenience, we choose to use the short-hand LHS for the term l occurring in the
right-hand side. For instance, the rule
$mesg(x, y, c) \to LHS \cup c\_init(x, y, y)$ will represent the rule
$mesg(x, y, c) \to mesg(x, y, c) \cup c\_init(x, y, y)$. Now for each step of the protocol,

^2 We describe here our encoding in a general way. However, for the particular
case of NSPK, the encoding could have been done without the AC symbol $\cup$,
since $\cup$ is only needed when the sending of a message depends on the
reception of two (or more) distinct messages, i.e. rules of the form
$m_1 \cup m_2 \to m_3$. In general, those rules are necessary to model
protocols, but it is not the case for this simple version of NSPK.
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we give the corresponding rewrite rule. The encoding into TRS is longer than 
the initial protocol specification of section 3 because it is more complete. For 
instance, whereas the initial specification only informally define how to check 
the content of messages and how to deal with communication requests, these 
points are formally defined in our specification with rewrite rules. Furthermore, 
the initial specification can be viewed as a trace of a correct execution of the 
NSPK protocol for two specific agents A and B. Thus, this specification cannot 
be directly used in a more general context where some other agents also use 
the protocol. Hence, another difference between our specification and the initial 
specification of section 3 is that agents’ identities of initial specification {A and 
B) have been abstracted by term with variables of the form agt{x), agt{y). In 
the following, x, y, z, u, v, x2, x3 and z2 are supposed to be variables since we 
consider an unbounded number of agents and transactions. 
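As a rough illustration of the cumulative-rule convention, the following Python
sketch models the AC term representing the current state as a frozenset (so that
order and grouping under the AC symbol are irrelevant) and expands the LHS
shorthand by always returning the old state together with the newly produced
terms. The tuple encoding of terms and the helper name `cumulative` are
assumptions made for this illustration; they are not part of our prototype.

```python
# Terms are nested tuples such as ("mesg", x, y, c); the state is a frozenset,
# a stand-in for a term built with the AC symbol (illustrative encoding only).

def cumulative(precondition, produce):
    """Build a rule l -> l U r: fire on every term satisfying the precondition."""
    def rule(state):
        new_terms = {r for t in state if precondition(t) for r in produce(t)}
        return frozenset(state) | new_terms   # "LHS U r": nothing is ever removed
    return rule

# mesg(x, y, c) -> LHS U c_init(x, y, y), the example rule given in the text.
report_init = cumulative(
    lambda t: t[0] == "mesg",
    lambda t: [("c_init", t[1], t[2], t[2])],
)

state0 = frozenset({("mesg", "a", "b", "c")})
state1 = report_init(state0)
```

Since the left-hand side is kept, applying the same rule again to `state1` adds
nothing new, which is the behavior expected from a cumulative rule.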

1. $A \to B : \{N_A, A\}_{K_B}$. The emission of the first message is encoded by
   the rule:

   $$goal(x, y) \to LHS \cup mesg(x, y, encr(pubkey(y), x, [N(x, y), x]))$$

   The meaning of this rule is the following: if an agent $x$ wants to establish
   a communication with $y$, then $x$ sends a message to $y$ whose content is
   encrypted with the public key of $y$. The content is represented here by a
   list (built with the classical operators $cons$ and $null$) containing a nonce
   $N(x, y)$ produced by $x$ for $y$ as well as $x$'s identity. For convenience,
   lists will be represented in the usual way; for example, a list of the form
   $cons(u, cons(v, null))$ will be denoted by $[u, v]$.

2. $B \to A : \{N_A, N_B, B\}_{K_A}$.

   $$mesg(x, agt(u), encr(pubkey(agt(u)), z, [v, agt(x2)])) \to
     LHS \cup mesg(agt(u), agt(x2), encr(pubkey(agt(x2)), agt(u),
     [v, N(agt(u), agt(x2)), agt(u)]))$$

   The second message is sent by an agent $agt(u)$ when he receives the first
   message from an agent $agt(x2)$ whose identity is enclosed in the message
   (see footnote 3). Note that in these rules, we achieve a kind of type checking
   on the content of the message. For instance, in the left-hand side of this
   rule, by expecting the message content pattern $[v, agt(x2)]$ instead of a
   more general pattern like $[v, x3]$, we check that this element of the message
   is an agent's identity. This kind of type checking is important, since it
   prevents some attacks based on type confusion, like those described in [17].

   Footnote 3: In this protocol, the agent's identity contained in the header of
   the message ($x$ in our example) is never used, since it may have been
   corrupted by an intruder. However, this information is sometimes used, for
   example in the extended version of NSPK where a key server is also involved.

3. $A \to B : \{N_B\}_{K_B}$. This step is encoded by the rule:

   $$mesg(x, agt(y), encr(pubkey(agt(y)), z2, [N(agt(y), agt(z)), u, agt(z)])) \to
     LHS \cup mesg(agt(y), agt(z), encr(pubkey(agt(z)), [u]))
     \cup c\_init(agt(y), agt(z), z2)$$

   When agent $agt(y)$ receives from $agt(z)$ the nonce $N(agt(y), agt(z))$ that
   he has built for $agt(z)$, he performs two actions. The first action is to
   send the last protocol message to $agt(z)$. The second action consists in
   reporting the communication that $agt(y)$ believes he has established with
   $agt(z)$. However, the reality may be different, and the identity of the real
   author of the message, $z2$, is used for filling the third field of the
   $c\_init$ term.

4. In the last step of the protocol, no message is sent, but when an agent
   receives the last message of the protocol, sent at step 3, he reports a
   communication in which he has the responder role:

   $$mesg(x, agt(y), encr(pubkey(agt(y)), z2, [N(agt(y), z)])) \to
     LHS \cup c\_resp(agt(y), z, z2)$$
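The four rules can be tried on a single honest session with the following
Python sketch. It is only an illustration on one concrete, finite state, not
the tree-automata completion used in the rest of the paper. Several choices are
assumptions of the sketch: terms are nested tuples, variables are strings
starting with `?`, lists are flat tuples tagged `"list"` instead of
$cons$/$null$ terms, and the message produced at step 3 carries the flag
$agt(y)$ as second argument of $encr$ so that it has the same arity as the
pattern expected at step 4.

```python
def match(pattern, term, env):
    """Match a pattern against a ground term; variables start with '?'."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in env:
            return env if env[pattern] == term else None
        return {**env, pattern: term}
    if isinstance(pattern, tuple) and isinstance(term, tuple) and len(pattern) == len(term):
        for p, t in zip(pattern, term):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None

def subst(term, env):
    """Replace the variables of a term by their values in env."""
    if isinstance(term, tuple):
        return tuple(subst(a, env) for a in term)
    return env.get(term, term)

def agt(x): return ("agt", x)
def N(x, y): return ("N", x, y)
def encr(owner, flag, content): return ("encr", ("pubkey", owner), flag, content)
def mesg(x, y, c): return ("mesg", x, y, c)
def lst(*items): return ("list",) + items

# (left-hand side, terms added to the state); the LHS itself is always kept.
RULES = [
    (("goal", "?x", "?y"),
     [mesg("?x", "?y", encr("?y", "?x", lst(N("?x", "?y"), "?x")))]),
    (mesg("?x", agt("?u"), encr(agt("?u"), "?z", lst("?v", agt("?x2")))),
     [mesg(agt("?u"), agt("?x2"),
           encr(agt("?x2"), agt("?u"), lst("?v", N(agt("?u"), agt("?x2")), agt("?u"))))]),
    (mesg("?x", agt("?y"),
          encr(agt("?y"), "?z2", lst(N(agt("?y"), agt("?z")), "?u", agt("?z")))),
     [mesg(agt("?y"), agt("?z"), encr(agt("?z"), agt("?y"), lst("?u"))),  # flag assumed
      ("c_init", agt("?y"), agt("?z"), "?z2")]),
    (mesg("?x", agt("?y"), encr(agt("?y"), "?z2", lst(N(agt("?y"), "?z")))),
     [("c_resp", agt("?y"), "?z", "?z2")]),
]

def saturate(state):
    """Apply the cumulative rules until no new term is produced."""
    state = set(state)
    while True:
        new = set()
        for lhs, rhs in RULES:
            for t in state:
                env = match(lhs, t, {})
                if env is not None:
                    new.update(subst(r, env) for r in rhs)
        if new <= state:
            return state
        state |= new

A, B = agt("A"), agt("B")
final = saturate({("goal", A, B)})
```

Starting from the single request $goal(agt(A), agt(B))$, the saturated state
contains the three protocol messages together with the two reports
$c\_init(agt(A), agt(B), agt(B))$ and $c\_resp(agt(B), agt(A), agt(A))$, i.e.
the beliefs of both parties agree with the real authors of the messages.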

To prove the authentication property of the protocol, we need to prove that any
pair of agents can securely establish a communication through the network,
whatever the behavior of the other agents and of an intruder may be. Thus, we
assume that there is an unbounded number of agent labels in $L_{agt}$, but we
will observe more precisely two agents, namely the agents labeled by $A$ and $B$.
For the unbounded number of other agent labels we will use integers built on the
usual operators $0$ and $s$ (successor). Hence, $L_{agt} = \{A, B\} \cup \mathbb{N}$
and the initial set of terms $E$ is the set of terms of the form
$goal(agt(x), agt(y))$ where $x, y \in L_{agt}$. In other words, $E$ is the set
of all communication requests

- from $A$ or $B$ towards any other agent $agt(i)$ with $i \in \mathbb{N}$, and

- from $agt(i)$ with $i \in \mathbb{N}$ towards $A$ or $B$, and

- from any agent $agt(i)$ to any agent $agt(j)$, $i, j \in \mathbb{N}$, and

- from $A$ to $B$, $B$ to $A$, $A$ to $A$ and $B$ to $B$.

Note that we work in a very general setting where we also take into account the
case where an agent uses the protocol to authenticate himself. Self-
authentication of an agent may be of no practical interest but, if it happens, we
want to verify that the intruder cannot take advantage of it to build an attack.
The set $E$ is recognized by the following tree automaton $A_0$. The final state
of $A_0$ is $q_{net}$ and here is the set of transitions:



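$$\begin{array}{lll}
0 \to q_{int} & agt(q_B) \to q_{agtB} & goal(q_{agtA}, q_{agtI}) \to q_{net} \\
s(q_{int}) \to q_{int} & q_{net} \cup q_{net} \to q_{net} & goal(q_{agtI}, q_{agtA}) \to q_{net} \\
A \to q_A & goal(q_{agtA}, q_{agtB}) \to q_{net} & goal(q_{agtB}, q_{agtI}) \to q_{net} \\
B \to q_B & goal(q_{agtB}, q_{agtA}) \to q_{net} & goal(q_{agtI}, q_{agtB}) \to q_{net} \\
agt(q_{int}) \to q_{agtI} & goal(q_{agtA}, q_{agtA}) \to q_{net} & goal(q_{agtI}, q_{agtI}) \to q_{net} \\
agt(q_A) \to q_{agtA} & goal(q_{agtB}, q_{agtB}) \to q_{net} &
\end{array}$$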
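To make the recognition of $E$ concrete, the following Python sketch runs the
transitions listed above bottom-up on a ground term and returns the set of
states that the term can reach. The dictionary encoding of the transitions, the
name `states_of`, and the spelling `"U"` for the AC symbol are assumptions of
this illustration; the AC behavior of the symbol is not modeled here.

```python
from itertools import product

AGENT_STATES = ("q_agtA", "q_agtB", "q_agtI")

# (symbol, argument states) -> set of target states, following the listing of A0.
A0 = {
    ("0", ()): {"q_int"},
    ("s", ("q_int",)): {"q_int"},
    ("A", ()): {"q_A"},
    ("B", ()): {"q_B"},
    ("agt", ("q_int",)): {"q_agtI"},
    ("agt", ("q_A",)): {"q_agtA"},
    ("agt", ("q_B",)): {"q_agtB"},
    ("U", ("q_net", "q_net")): {"q_net"},
}
# The nine goal transitions cover every pair of agent states.
for sender, receiver in product(AGENT_STATES, repeat=2):
    A0[("goal", (sender, receiver))] = {"q_net"}

def states_of(term, delta):
    """States reachable by a ground term (symbol, arg1, ..., argn)."""
    symbol, args = term[0], term[1:]
    arg_states = [states_of(a, delta) for a in args]
    reached = set()
    for combo in product(*arg_states):
        reached |= delta.get((symbol, combo), set())
    return reached

agent_a = ("agt", ("A",))
agent_1 = ("agt", ("s", ("0",)))          # the agent labeled by the integer 1
request = ("goal", agent_a, agent_1)
```

A term belongs to the language of $A_0$ when `"q_net"` is among the states
returned for it; terms built with symbols that have no transition (for instance
$mesg$ at this stage) yield the empty set.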







Description of the Intruder 

In this last automaton, the state $q_{net}$ is a special state representing both
the network and the fact base containing communication requests and
communication reports. As in many other approaches to the verification of
cryptographic protocols, the intruder is supposed to have total control of the
network. In particular, the intruder is assumed to know every message sent on
the network. In our approach this assumption is a bit stronger: the intruder is
the network. A direct consequence of this choice is that the knowledge of the
intruder, and every message that the intruder can build, is supposed to always
remain on the network. Furthermore, we suppose that the agents $agt(i)$ with
$i \in \mathbb{N}$ (i.e. every agent that is not $A$ or $B$) may be dishonest and
deliberately give the intruder their private key as well as the content of any
message they send or receive. The intruder can also disassemble messages or
build new ones from his knowledge. Rewrite rules are the simplest way to describe
how an intruder can decrypt or disassemble components of a message. Since the
agents $agt(i)$ with $i \in \mathbb{N}$ are foolish enough to give their private
keys to the intruder, he can decrypt the messages encrypted with their public
keys. Conversely, we assume that the intruder has no means of guessing the
private key of $A$ or $B$. Here are the corresponding rules, which can be applied
to the AC term representing the network, i.e. the intruder's knowledge:

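$$\begin{array}{ll}
cons(x, y) \cup z \to LHS \cup x & \text{/* Disassembling */} \\
cons(x, y) \cup z \to LHS \cup y & \\
mesg(x, y, z) \cup u \to LHS \cup z & \\
encr(pubkey(agt(0)), y, z) \cup u \to LHS \cup z & \text{/* Decrypting */} \\
encr(pubkey(agt(s(x))), y, z) \cup u \to LHS \cup z &
\end{array}$$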
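The effect of these five rules on a finite set of messages can be sketched in
Python as a closure computation. This is only an illustration on concrete terms
(in our approach the rules are applied through the approximation construction,
not on an explicit set); the tuple encoding of terms and the names `dishonest`
and `analyse` are assumptions of the sketch.

```python
def dishonest(agent):
    """agt(0) and agt(s(x)) are the agents whose private keys are leaked."""
    return agent[0] == "agt" and agent[1][0] in ("0", "s")

def analyse(network):
    """Close a set of terms under the disassembling and decrypting rules."""
    known = set(network)
    todo = list(network)
    while todo:
        t = todo.pop()
        if t[0] == "cons":                       # cons(x, y): both components
            parts = [t[1], t[2]]
        elif t[0] == "mesg":                     # mesg(x, y, z): the content z
            parts = [t[3]]
        elif t[0] == "encr" and t[1][0] == "pubkey" and dishonest(t[1][1]):
            parts = [t[3]]                       # key of agt(0) or agt(s(x))
        else:
            parts = []
        for p in parts:
            if p not in known:
                known.add(p)
                todo.append(p)
    return known

def encr(owner, flag, content):
    return ("encr", ("pubkey", owner), flag, content)

A = ("agt", ("A",))
B = ("agt", ("B",))
I = ("agt", ("0",))
secret = ("N", A, B)
leaked = ("mesg", A, I, encr(I, A, ("cons", secret, ("null",))))
protected = ("mesg", A, B, encr(B, A, ("cons", secret, ("null",))))
```

In this example the nonce is recovered from `leaked`, which is encrypted with
the public key of $agt(0)$, but not from `protected`, which is encrypted with
the public key of $agt(B)$.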

On the other hand, the intruder's ability to build new messages from his
knowledge is defined concisely by some tree automaton transitions. Since
$q_{net}$ is the state of $A_0$ recognizing all the messages on the network, and
since in our setting the knowledge of the intruder is the network, $q_{net}$ is
also the state recognizing the knowledge of the intruder. First, we assume that
the intruder knows the identity of every agent of the network, as well as their
public keys.

$$\begin{array}{lll}
agt(q_{int}) \to q_{net} & agt(q_A) \to q_{net} & agt(q_B) \to q_{net} \\
pubkey(q_{agtI}) \to q_{net} & pubkey(q_{agtA}) \to q_{net} & pubkey(q_{agtB}) \to q_{net}
\end{array}$$

Agents $agt(i)$ with $i \in \mathbb{N}$ give the intruder the nonces they produce
for other agents:

$$N(q_{agtI}, q_{agtA}) \to q_{net} \qquad N(q_{agtI}, q_{agtB}) \to q_{net} \qquad N(q_{agtI}, q_{agtI}) \to q_{net}$$

Finally, starting from components he already knows or will obtain later (i.e.
terms in $q_{net}$), the intruder can combine them into lists with the $cons$
operator, encrypt them with anything (including keys) he knows with the operator
$encr$, build messages with the operator $mesg$, etc. in order to enrich his
knowledge (the language recognized by $q_{net}$). Note, however, that the second
field of the operator $encr$ (which is a flag) cannot be corrupted by the
intruder and always refers to $q_{agtI}$, the real author of the encryption, i.e.
the intruder.

$$\begin{array}{lll}
cons(q_{net}, q_{net}) \to q_{net} & null \to q_{net} & encr(q_{net}, q_{agtI}, q_{net}) \to q_{net} \\
mesg(q_{net}, q_{net}, q_{net}) \to q_{net} & &
\end{array}$$

There are several things to notice here. First, the initial description of
$\mathcal{L}(A_0, q_{net})$ is as wide and loose as possible: roughly, it
authorizes the intruder to build nearly every term of $\mathcal{T}(\mathcal{F})$
except terms containing nonces built by $A$ or $B$, i.e. terms containing
subterms of the form $N(agt(A), agt(x))$ or $N(agt(B), agt(y))$. This can be
automatically obtained by a complement operation. This kind of specification is
quite natural with regard to the description of the intruder, since it is much
simpler and more convincing to specify what cannot be built by the intruder than
to define precisely and totally what he can do. Consequently, the language
recognized by state $q_{net}$ is loose, and it may also contain strangely formed
messages whose effect on the protocol can hardly be predicted, for example:

$$mesg(agt(A), agt(B), encr(pubkey(agt(B)), agt(0),
  [encr(pubkey(agt(A)), agt(0), [N(agt(0), agt(A))]), N(agt(0), agt(B))]))$$

i.e. a message of the form
$agt(A) \to agt(B) : \{\{N_{agt(0)}\}_{K_{agt(A)}}, N_{agt(0)}\}_{K_{agt(B)}}$.
The language recognized by $q_{net}$ also contains, for instance, terms
representing repeated encryption (an unbounded number of times), which are
important to consider for the verification of cryptographic protocols:

$$encr(pubkey(agt(A)), agt(s(0)), encr(pubkey(agt(B)), agt(0), encr(\ldots$$

The last thing to remark here is that, during the approximation construction,
new messages or message components $m$ obtained by rewriting are added to the
language recognized by the automaton $A_i$ as new transitions in $A_{i+1}$ such
that $m \to^* q_{net}$, and thus can be used 'dynamically' as new base components
for the intruder's message constructions.

To sum up, we have described here a model where we consider an unbounded number
of agents executing an unbounded number of protocol sessions in parallel. In
particular, note that if there exists an attack based on parallel protocol
sessions between, say, four agents $A$, $B$, $C$ and $D$, this attack will appear
in the model: $C$ and $D$ can be represented by two 'dishonest' agents, say
$agt(i)$ and $agt(j)$ with $i, j \in \mathbb{N}$ and $i \neq j$, since all
'dishonest' agents are able to respect the protocol.

5 Approximation and Verification 

Extensions of Approximations to AC non Left-Linear TRSs 

In this section, we show how to extend the approximation construction to this
larger class of TRSs. Roughly, the problem with non left-linear rules is the
following: let $f(x, x) \to g(x)$ be a rule of $\mathcal{R}$ and let $A$ be a
tree automaton whose set of transitions contains $f(q_1, q_1) \to q_0$ and
$f(q_2, q_3) \to q_0$. Although we can construct a valid substitution
$\sigma = \{x \mapsto q_1\}$ for matching the rewrite rule on the first
transition, this is not the case for the second one. The semantics of a
completion between the rule $f(x, x) \to g(x)$ and the transition
$f(q_2, q_3) \to q_0$ would be to find the common language of terms recognized by
both $q_2$ and $q_3$. This can be obtained by computing a new tree automaton $A'$
with a set of states $Q'$ such that $Q'$ is disjoint from the states of $A$ and
$\exists q \in Q' : \mathcal{L}(A', q) = \mathcal{L}(A, q_2) \cap \mathcal{L}(A, q_3)$.
Then, to end the completion step, it would be enough to add the transitions of
$A'$ to $A$ together with the new transition $g(q) \to q_0$. However, adding the
transitions of $A'$ to $A$ also adds $Q'$ to the states of $A$. Thus, we add new
states to $A$ and, in some cases, this may lead to non-termination of the
approximation construction.

On the other hand, one can remark that the non-linearity problem would disappear
with deterministic automata, since for any deterministic automaton $A_{det}$ and
for all distinct states $q, q'$ of $A_{det}$ we trivially have
$\mathcal{L}(A_{det}, q) \cap \mathcal{L}(A_{det}, q') = \emptyset$. However,
determinization of a tree automaton may result in an exponential blow-up of the
number of states [3]. Thus, we chose here to use locally deterministic tree
automata: non-deterministic tree automata with some deterministic states, i.e.
states $q$ such that there are no two rules $t \to q$ and $t \to q'$ with
$q \neq q'$. Hence, for every deterministic state $q$, we have
$\forall q' \neq q : \mathcal{L}(A, q) \cap \mathcal{L}(A, q') = \emptyset$.
During the approximation construction, if all states matched by a non-linear
variable of the left-hand side of a rule are deterministic, then it is enough to
build critical pairs where the non-linear variables of the left-hand side are
mapped to the same state. For instance, in the last example, it is enough to
build the first critical pair, add the transition $g(q_1) \to q_0$, and keep
$q_2, q_3$ deterministic, i.e. such that
$\mathcal{L}(T_{\mathcal{R}}\uparrow(A), q_2) \cap \mathcal{L}(T_{\mathcal{R}}\uparrow(A), q_3) = \emptyset$.
We now show the completeness of this algorithm on locally deterministic tree
automata.
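The notion of deterministic state can be checked mechanically on a list of
transitions. The Python sketch below is a direct reading of the definition
above (no two rules $t \to q$ and $t \to q'$ with $q \neq q'$); the encoding of
transitions as pairs and the name `deterministic_states` are assumptions of the
illustration, and the example uses only a handful of the transitions of section
4.

```python
def deterministic_states(transitions):
    """States q with no pair of transitions t -> q and t -> q' where q != q'.

    transitions: list of (lhs, state) pairs, lhs = (symbol, argument states).
    """
    targets = {}
    for lhs, q in transitions:
        targets.setdefault(lhs, set()).add(q)
    all_states = {q for _, q in transitions}
    shared = {q for qs in targets.values() if len(qs) > 1 for q in qs}
    return all_states - shared

transitions = [
    (("A", ()), "q_A"),
    (("B", ()), "q_B"),
    (("0", ()), "q_int"),
    (("s", ("q_int",)), "q_int"),
    (("agt", ("q_A",)), "q_agtA"),
    (("agt", ("q_A",)), "q_net"),   # intruder transition: agt(q_A) -> q_net
]
```

On this fragment the states $q_A$, $q_B$ and $q_{int}$ are deterministic, whereas
$q_{agtA}$ and $q_{net}$ are not, because $agt(q_A)$ is the left-hand side of two
transitions with different targets.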

For every non-linear term $t$, let us denote by $t_{lin}$ the linearized term,
i.e. the term $t$ where all occurrences of non-linear variables are replaced by
distinct variables. For example, if $t = f(x, y, g(x, x))$, then
$t_{lin} = f(x', y, g(x'', x'''))$.
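Linearization is easy to state operationally. In the Python sketch below,
function applications are tuples and variables are plain strings (an encoding
chosen for this illustration); each occurrence of a variable occurring more than
once receives one more prime than the previous occurrence, as in the example
above. The sketch assumes that the primed names are not already used in the
term.

```python
from collections import Counter

def occurrences(term):
    """Variables of a term in left-to-right order, with repetitions."""
    if isinstance(term, str):
        return [term]
    return [v for arg in term[1:] for v in occurrences(arg)]

def linearize(term):
    """Rename every occurrence of a non-linear variable to a distinct variable."""
    counts = Counter(occurrences(term))
    seen = Counter()

    def walk(u):
        if isinstance(u, str):
            if counts[u] == 1:            # linear variable: left unchanged
                return u
            seen[u] += 1
            return u + "'" * seen[u]      # k-th occurrence gets k primes
        return (u[0],) + tuple(walk(a) for a in u[1:])

    return walk(term)

t = ("f", "x", "y", ("g", "x", "x"))
t_lin = linearize(t)
```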

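Definition 4. (States Matching) Let $A$ be a tree automaton, $Q$ its set of
states, $t \in \mathcal{T}(\mathcal{F}, \mathcal{X})$ a non-linear term, and
$\{p_1, \ldots, p_n\} \subseteq \mathcal{P}os(t)$ the set of positions of a
non-linear variable $x$ in $t$. We say that states $q_1, \ldots, q_n \in Q$ are
matched by $x$ iff $\exists \sigma \in \Sigma(Q, \mathcal{X})$ s.t.
$t_{lin}\sigma \to^*_A q \in Q$, and
$t_{lin}\sigma|_{p_1} = q_1, \ldots, t_{lin}\sigma|_{p_n} = q_n$.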

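Theorem 2. (Completeness Extended to non Left-Linear TRS) Let $A$ be a tree
automaton, $\mathcal{R}$ a TRS, $T_{\mathcal{R}}\uparrow(A)$ the corresponding
approximation automaton and $Q$ its set of states. For every non left-linear
rule $l \to r \in \mathcal{R}$, for every non-linear variable $x$ of $l$, for all
states $q_1, \ldots, q_n \in Q$ matched by $x$, if either $q_1 = \ldots = q_n$ or
$\mathcal{L}(T_{\mathcal{R}}\uparrow(A), q_1) \cap \ldots \cap \mathcal{L}(T_{\mathcal{R}}\uparrow(A), q_n) = \emptyset$
then

$$\mathcal{L}(T_{\mathcal{R}}\uparrow(A)) \supseteq \mathcal{R}^*(\mathcal{L}(A))$$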

Proof. (Sketch) (See [10] for a detailed proof.) Assume that there exists a term
$t$ such that $t \in \mathcal{R}^*(\mathcal{L}(A))$ and
$t \notin \mathcal{L}(T_{\mathcal{R}}\uparrow(A))$. Let $s \in \mathcal{L}(A)$ be
such that $s \to^*_{\mathcal{R}} t$. On the rewrite chain from $s$ to $t$, let
$t_1, t_2$ be the first two terms such that
$t_1 \in \mathcal{L}(T_{\mathcal{R}}\uparrow(A))$, $t_1 \to_{\mathcal{R}} t_2$ and
$t_2 \notin \mathcal{L}(T_{\mathcal{R}}\uparrow(A))$. We then show that the rule
$l \to r \in \mathcal{R}$ applied for rewriting $t_1$ into $t_2$ is necessarily a
non left-linear rule (otherwise $t_2$ would be in
$\mathcal{L}(T_{\mathcal{R}}\uparrow(A))$). Then, we obtain that there exists a
subterm $u$ of $t_1$ matched by all occurrences of a non-linear variable $x$ in
$l$, and that there exist at least two distinct states $q, q'$ of
$T_{\mathcal{R}}\uparrow(A)$ such that $u \to^* q$ and $u \to^* q'$. This
contradicts the hypothesis of the theorem, since $q$ and $q'$ are matched by $x$
in $l$ and
$\mathcal{L}(T_{\mathcal{R}}\uparrow(A), q) \cap \mathcal{L}(T_{\mathcal{R}}\uparrow(A), q') \supseteq \{u\} \neq \emptyset$. $\square$

In our framework, states matched by non-linear variables are easily kept
deterministic. For example, in the NSPK specification, non-linear variables
always match terms $A$, $B$, $i \in \mathbb{N}$ (representing agent labels),
which are initially recognized by $q_A$, $q_B$, $q_{int}$, respectively. Those
states are initially deterministic, and this property is trivially preserved
during completion, since agent labels do not occur in the right-hand sides of
rules and thus do not occur in the new transitions to be added. However, when
necessary, we can also automatically check this property on
$T_{\mathcal{R}}\uparrow(A)$ by proving that
$\mathcal{L}(T_{\mathcal{R}}\uparrow(A), q_1) \cap \ldots \cap \mathcal{L}(T_{\mathcal{R}}\uparrow(A), q_n) = \emptyset$
for each non-linear variable $x$ of a rule matching distinct states
$q_1, \ldots, q_n$.

For dealing with the AC symbols, the extension is straightforward. Since
approximation can deal with non-terminating TRSs, we can explicitly define the
AC behavior of a symbol. Thus, we replace in $\mathcal{F}$ the (implicit) AC
symbol $\cup$ by a non-AC symbol $\cup$ and add to $\mathcal{R}$ the following
left-linear rules defining explicitly the AC behavior of $\cup$:

$$x \cup y \to y \cup x \qquad (x \cup y) \cup z \to x \cup (y \cup z) \qquad x \cup (y \cup z) \to (x \cup y) \cup z$$
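As a small sanity check of this encoding, the Python sketch below applies the
three rules at every position of a term and collects all the terms reachable
from it. The tuple representation (with `"U"` standing for the non-AC symbol)
and the function names are assumptions of the illustration. Note that these
rules never terminate on their own, which is harmless here because the set of
reachable terms is finite.

```python
def U(a, b):
    return ("U", a, b)

def root_steps(t):
    """Terms obtained by applying one of the three rules at the root of t."""
    out = []
    if isinstance(t, tuple) and t[0] == "U":
        _, a, b = t
        out.append(U(b, a))                          # x U y -> y U x
        if isinstance(a, tuple) and a[0] == "U":
            out.append(U(a[1], U(a[2], b)))          # (x U y) U z -> x U (y U z)
        if isinstance(b, tuple) and b[0] == "U":
            out.append(U(U(a, b[1]), b[2]))          # x U (y U z) -> (x U y) U z
    return out

def steps(t):
    """Terms obtained by one rewrite step at any position of t."""
    out = root_steps(t)
    if isinstance(t, tuple):
        for i in range(1, len(t)):
            for s in steps(t[i]):
                out.append(t[:i] + (s,) + t[i + 1:])
    return out

def reachable(t):
    """All terms reachable from t with the three rules."""
    seen, todo = {t}, [t]
    while todo:
        for s in steps(todo.pop()):
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return seen

terms = reachable(U("x", U("y", "z")))
```

For three distinct leaves, the reachable set contains the twelve terms of the
AC-equivalence class (two bracketings times six orderings), among them
$(x \cup y) \cup z$.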

Approximation Function 

Let $\mathcal{R}$ and $A_0$ be respectively the set of all rewrite rules and the
tree automaton given above. Our aim is now to compute a tree automaton
$T_{\mathcal{R}}\uparrow(A_0)$ recognizing a superset of
$\mathcal{R}^*(\mathcal{L}(A_0))$ and thus to over-approximate the network, i.e.
the set of all possible sent messages (as well as the set of communication
reports). We now give the approximation function $\gamma$, defining the folding
positions for $\mathcal{R}$ and $A_0$. For the approximation, the first choice we
have made is to merge the dishonest agents ($agt(i)$ with $i \in \mathbb{N}$)
together. In other words, in our approximation, no difference is made between
agents $agt(i)$ and $agt(j)$ for any $i, j \in \mathbb{N}$. However, we still
distinguish between $agt(A)$, $agt(B)$ and any agent $agt(i)$ with
$i \in \mathbb{N}$. In a similar manner, we collapse together all the messages
sent and received by dishonest agents, but we still do not confuse messages
involving $agt(A)$ or $agt(B)$. For example, the approximation function used for
rule (1), i.e.

$$goal(x, y) \to LHS \cup mesg(x, y, encr(pubkey(y), x, [N(x, y), x]))$$

is such that there are only seven distinct values for $\gamma$ (the detail of the
sequences of new states used for each value can be found in [10] with the
complete specification):

(i) $\gamma((1), q_{net}, \{x \mapsto q_{agtA}, y \mapsto q_{agtB}\})$

(ii) $\gamma((1), q_{net}, \{x \mapsto q_{agtB}, y \mapsto q_{agtA}\})$

(iii) $\gamma((1), q_{net}, \{x \mapsto q_{agtA}, y \mapsto q_{agtA}\})$

(iv) $\gamma((1), q_{net}, \{x \mapsto q_{agtB}, y \mapsto q_{agtB}\})$

(v) $\gamma((1), q_{net}, \{x \mapsto q_{agtI}, y \mapsto q_{agtA}\})$

(vi) $\gamma((1), q_{net}, \{x \mapsto q_{agtI}, y \mapsto q_{agtB}\})$

(vii) $\gamma((1), q_{net}, \{y \mapsto q_{agtI}\})$

According to case (i), all messages generated by rule (1), where $x$ is the agent
labeled by $A$ and $y$ is the agent labeled by $B$, are decomposed using the same
states, defined by the sequence
$\gamma((1), q_{net}, \{x \mapsto q_{agtA}, y \mapsto q_{agtB}\})$. Similarly,
case (vii) means that all messages generated by rule (1), where $y$ is an agent
labeled by $i \in \mathbb{N}$ and $x$ is any agent, are decomposed using states
of the same sequence $\gamma((1), q_{net}, \{y \mapsto q_{agtI}\})$. Thus, no
difference is made, for example, between messages sent by $agt(A)$ to $agt(i)$,
messages sent by $agt(B)$ to $agt(j)$, and messages sent by $agt(i)$ to $agt(j)$
for any $i, j \in \mathbb{N}$. This is in fact natural, since all messages sent
to a dishonest agent are captured and factorized by the same intruder.
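The count of seven values can be reproduced with a few lines of Python. The
sketch does not compute the sequences of fresh states returned by $\gamma$ (those
are given in [10]); it only models, under that simplification, which arguments
of $\gamma$ are distinguished for rule (1). The name `gamma_key` and the string
names of the states are assumptions of the illustration.

```python
from itertools import product

AGENT_STATES = ("q_agtA", "q_agtB", "q_agtI")

def gamma_key(rule, state, sigma):
    """Key identifying which sequence of normalization states is shared."""
    if sigma["y"] == "q_agtI":
        # Case (vii): the receiver is dishonest, so the sender x is ignored.
        return (rule, state, (("y", "q_agtI"),))
    return (rule, state, (("x", sigma["x"]), ("y", sigma["y"])))

keys = {
    gamma_key(1, "q_net", {"x": x, "y": y})
    for x, y in product(AGENT_STATES, repeat=2)
}
```

Of the nine possible substitutions for $x$ and $y$, the three that send a
message to a dishonest agent collapse into a single key, which leaves seven
distinct keys.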



Verification 



We use a prototype, based on a tree automata library [9,8] developed in ELAN
[2], which automatically computes approximations for a given $\mathcal{R}$, $A_0$
and approximation function $\gamma$. With the approximation function given above,
we obtain a finite tree automaton $T_{\mathcal{R}}\uparrow(A_0)$, with about 130
states and 340 transitions, recognizing a regular superset of
$\mathcal{R}^*(\mathcal{L}(A_0))$. See [10] for the complete specification and
for a complete listing of the automaton $T_{\mathcal{R}}\uparrow(A_0)$.

With this automaton, we can directly verify that NSPK has the confidentiality
and authentication properties. For confidentiality, it is enough to verify that
the intruder cannot capture a nonce of the form $N(agt(x), agt(y))$ where
$x, y \in \{A, B\}$. Since in our model the intruder emits all his knowledge on
the network (as explained in section 4), this can be done by checking that the
intruder cannot emit a nonce of the form $N(agt(A), agt(B))$,
$N(agt(B), agt(A))$, ..., i.e. that the intersection between
$T_{\mathcal{R}}\uparrow(A_0)$ and the automaton $A_{conf}$ is empty. The final
state of $A_{conf}$ is $q_{net}$ and its transitions are:



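$$\begin{array}{lll}
A \to q_A & agt(q_B) \to q_{agtB} & N(q_{agtA}, q_{agtA}) \to q_{net} \\
B \to q_B & N(q_{agtA}, q_{agtB}) \to q_{net} & N(q_{agtB}, q_{agtB}) \to q_{net} \\
agt(q_A) \to q_{agtA} & N(q_{agtB}, q_{agtA}) \to q_{net} & q_{net} \cup q_{net} \to q_{net}
\end{array}$$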



The intersection can be automatically computed, and we obtain a tree automaton
whose set of states is empty, i.e. the recognized language is empty. Hence, there
is no term of $\mathcal{L}(A_{conf})$ in $\mathcal{L}(T_{\mathcal{R}}\uparrow(A_0))$,
nor in $\mathcal{R}^*(\mathcal{L}(A_0))$. Similarly, the cases where
authentication is corrupted can be described by the following automaton
$A_{aut}$, whose final state is $q_{net}$ and whose transitions are:



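$$\begin{array}{lll}
0 \to q_{int} & c\_init(q_{agtA}, q_{agtB}, q_{agtI}) \to q_{net} & c\_init(q_{agtA}, q_{agtA}, q_{agtI}) \to q_{net} \\
s(q_{int}) \to q_{int} & c\_init(q_{agtA}, q_{agtB}, q_{agtA}) \to q_{net} & c\_resp(q_{agtA}, q_{agtA}, q_{agtI}) \to q_{net} \\
A \to q_A & c\_resp(q_{agtB}, q_{agtA}, q_{agtI}) \to q_{net} & c\_init(q_{agtA}, q_{agtA}, q_{agtB}) \to q_{net} \\
B \to q_B & c\_resp(q_{agtB}, q_{agtA}, q_{agtB}) \to q_{net} & c\_resp(q_{agtA}, q_{agtA}, q_{agtB}) \to q_{net} \\
agt(q_{int}) \to q_{agtI} & c\_init(q_{agtB}, q_{agtA}, q_{agtI}) \to q_{net} & c\_init(q_{agtB}, q_{agtB}, q_{agtI}) \to q_{net} \\
agt(q_A) \to q_{agtA} & c\_init(q_{agtB}, q_{agtA}, q_{agtB}) \to q_{net} & c\_resp(q_{agtB}, q_{agtB}, q_{agtI}) \to q_{net} \\
agt(q_B) \to q_{agtB} & c\_resp(q_{agtA}, q_{agtB}, q_{agtI}) \to q_{net} & c\_init(q_{agtB}, q_{agtB}, q_{agtA}) \to q_{net} \\
q_{net} \cup q_{net} \to q_{net} & c\_resp(q_{agtA}, q_{agtB}, q_{agtA}) \to q_{net} & c\_resp(q_{agtB}, q_{agtB}, q_{agtA}) \to q_{net}
\end{array}$$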



encoding all the cases where there is a distortion in communication reports
between the belief of the parties and the reality, for example terms of the form
$c\_init(agt(A), agt(B), agt(k))$ for $k \in \mathbb{N} \cup \{A\}$, meaning that
$agt(A)$ believes he has established a communication with $B$ but, in reality, he
has been fooled and communicates with some $agt(i)$ with $i \in \mathbb{N}$ or
with himself. The intersection between $T_{\mathcal{R}}\uparrow(A_0)$ and the
automaton $A_{aut}$ is also empty (see [10] for traces of execution).
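The emptiness tests used in this section can be illustrated by the standard
product construction: a pair of states $(q_1, q_2)$ is reachable when some
ground term is recognized by $q_1$ in the first automaton and by $q_2$ in the
second one. The Python sketch below is not our ELAN implementation, and the two
toy automata are assumptions made for the example (they are far smaller than the
real $T_{\mathcal{R}}\uparrow(A_0)$); the transitions for the AC symbol are
omitted.

```python
from itertools import product

def reachable_pairs(d1, d2):
    """Pairs (q1, q2) such that some ground term reaches q1 in d1 and q2 in d2."""
    pairs = set()
    changed = True
    while changed:
        changed = False
        for (f, args1), targets1 in d1.items():
            for (g, args2), targets2 in d2.items():
                if f != g or len(args1) != len(args2):
                    continue
                if all(p in pairs for p in zip(args1, args2)):
                    for pq in product(targets1, targets2):
                        if pq not in pairs:
                            pairs.add(pq)
                            changed = True
    return pairs

def intersection_is_empty(d1, final1, d2, final2):
    pairs = reachable_pairs(d1, d2)
    return not any((q1, q2) in pairs for q1 in final1 for q2 in final2)

# Toy stand-in for the approximation: only a dishonest agent's nonce is public.
toy_approx = {
    ("A", ()): {"q_A"}, ("0", ()): {"q_int"},
    ("agt", ("q_A",)): {"q_agtA"}, ("agt", ("q_int",)): {"q_agtI"},
    ("N", ("q_agtI", "q_agtA")): {"q_net"},
}
# Fragment of A_conf: nonces built by A or B for A or B.
a_conf = {
    ("A", ()): {"q_A"}, ("B", ()): {"q_B"},
    ("agt", ("q_A",)): {"q_agtA"}, ("agt", ("q_B",)): {"q_agtB"},
}
for x, y in product(("q_agtA", "q_agtB"), repeat=2):
    a_conf[("N", (x, y))] = {"q_net"}

safe = intersection_is_empty(toy_approx, {"q_net"}, a_conf, {"q_net"})
leaky = dict(toy_approx)
leaky[("N", ("q_agtA", "q_agtA"))] = {"q_net"}   # a nonce of A reaches the network
flawed = intersection_is_empty(leaky, {"q_net"}, a_conf, {"q_net"})
```

On the first toy automaton the intersection with the fragment of $A_{conf}$ is
empty; as soon as a transition recognizing a nonce built by $agt(A)$ is added to
$q_{net}$, the intersection becomes non-empty and the violation is detected.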
6 Conclusion

In this paper, we have shown an application of descendant approximation to the
verification of cryptographic protocols. We have obtained a positive proof of
authentication and confidentiality of NSPK. Moreover, applying the same
approximation mechanism to the flawed NSPK specification of [19] has led to some
non-empty intersections with $A_{conf}$ and $A_{aut}$, signaling violations of
the confidentiality and authentication properties.

An interesting aspect of this method is that it takes advantage of theorem
proving and a form of abstract interpretation called approximation. The basic
deduction mechanism, coming from the domain of theorem proving, provides some
simple and efficient tools - tree automata - to manipulate infinite objects. On
the other hand, approximation simplifies the proof in such a way that it can be
automatically computed afterwards.

Compared to other rewriting-based verification techniques like proofs by
consistency or proofs by induction, the properties that can be proved with the
approximation technique are clearly more restricted: they could be qualified as
'regular properties'. However, by restricting attention to 'regular properties',
we obtain a verification technique that enjoys many interesting practical
properties: termination of the TRS is not needed, the TRS may include AC symbols,
proofs are obtained by intersections with $T_{\mathcal{R}}\uparrow(A_0)$
(automatically and quickly computed), and the construction of
$T_{\mathcal{R}}\uparrow(A_0)$ is automatic, incremental and can be guaranteed to
terminate by a good choice of the approximation function $\gamma$ (as in the NSPK
case above, or in a fully automatic way as in [9]). Constructing an approximation
function does not require any particular skill in formal proof, since it only
consists in pointing out some sets of objects (represented here by states
recognizing regular sets of terms) to be merged together in order to build an
approximated model. In the NSPK case, the approximation function $\gamma$ has
been entirely given by hand, but it is systematic: for each distinct value of the
co-domain of $\gamma$, the user has to give a sequence of fresh states used for
normalizing new transitions. For historical reasons, this step is manual in our
prototype, but it will be automated in the new implementation of this tool, which
is in progress.

We can also compare this technique with other techniques used for verifying
cryptographic protocols. The first main difference to be pointed out is that our
technique is not designed for discovering attacks. From the approximation
$T_{\mathcal{R}}\uparrow(A_0)$, we can derive some information on the context of
those attacks, but it is approximate and should be studied with a theorem prover
or a model-checker to reconstruct an exact trace of the attack. Model-checking
is, in fact, particularly well suited for attack discovery, as shown by the many
flaws discovered by G. Lowe [15]. Furthermore, when attacks are no longer found,
model-checking can also be used to verify cryptographic protocols by lifting the
properties proved on a finite domain to an unbounded one. However, the lifting
has to be done by hand, as G. Lowe did in [14], or, in a more automatic way, by
abstract interpretation, as is done by D. Bolignano in [1]. Although we started
with a different formalism and used a different technique, our approach is very
close to D. Bolignano's. In particular, approximation functions can be seen as
particular abstract interpretations. Nevertheless, approximations enjoy a
property that abstract interpretations do not have in general: safety of our
abstract model (approximations) is implicit and guaranteed by Theorem 1 for
every approximation function $\gamma$.

Automated theorem proving has also been widely used for the verification of
cryptographic protocols. The NRL Protocol Analyzer, developed by C. Meadows [17],
uses narrowing. L. Paulson applied induction proof and the theorem prover
Isabelle/HOL to the verification of cryptographic protocols [20]. Those two
theorem proving approaches achieve a very detailed verification of protocols. In
particular, they provide one of the most convincing answers to the problem of
freshness. On the other hand, the proofs may diverge, and the main difficulty
remains to inject the right lemma at the right moment in order to make the proof
converge. Thus, automation of this kind of method remains partial. Furthermore,
proofs are long and complex, and they require a user with strong practical
experience of the prover. A more recent work is due to C. Weidenbach [22], who
gave a positive proof for the Neuman-Stubblebine protocol using the theorem
prover SPASS. His technique is based on saturation of sets of Horn clauses,
which is related to the descendant computation we use here. For a restricted
class of clauses called semi-linear, saturation can be computed exactly. However,
when the protocol specification cannot be encoded into semi-linear clauses, the
saturation process may diverge. Thus, specifications must be modified in order
to ensure termination of the process. In our framework, no restriction is set on
the TRSs we use but, instead, we defined an over-approximation technique in
order to tackle the divergence problem.

In [6], G. Denker, J. Meseguer and C. Talcott proposed to encode NSPK into
object-oriented TRSs. This encoding is executable and is used for detecting
attacks in the initial version of the protocol by testing. Using objects is
clearly a great advantage for the clarity and readability of the encoding.
Nevertheless, since rewriting remains the operational model of object-oriented
rewriting, it should be possible to extend approximations to objects and thus
benefit from the clarity of object-oriented specifications.

In [18], D. Monniaux also uses tree automata and a completion mechanism for
verifying cryptographic protocols. With regard to our work, an important
difference is that his method can only deal with a bounded number of agents and
a bounded number of protocol sessions. From a more technical point of view,
unlike in our approach, rewriting is only used for estimating the intruder's
knowledge and not for encoding the protocol itself. Moreover, his completion
mechanism is limited to the decidable and well-known case of collapsing rules
(see footnote 4), covered by the decidable and more general case of right-linear
and monadic rules [21]. However, this approach is interesting, since it shows a
possible way of combining tree automata and state-transition models for abstract
interpretation of protocols: tree automata and completion for abstracting
structures, and state-transition models for representing the notion of time in
the abstract model.

Footnote 4: The right-hand side of a collapsing rule is reduced to a variable
occurring in its left-hand side.

In the approximation model we consider for NSPK, time is totally collapsed, i.e.
every message is considered to be permanently sent and received at every moment.
Collapsing time lets us easily consider an infinite number of protocol sessions
in a finite model. Although this does not raise problems for proving
confidentiality or authentication properties of NSPK, this is not the case in
general. For instance, in electronic commerce protocols like SET [16], there is
little hope of proving any security property on an abstract model with no time,
since freshness plays a central role. A direct solution is to consider several
states for the network (i.e. for the intruder's knowledge) for different steps
of the protocol instead of collapsing all states into one.

Our main goal is to be able to handle protocols as complex as SET. To achieve
this goal, the first thing to consider is to formally define the concepts used in
cryptographic protocols (keys, nonces, agents, ...) in order to get a natural
protocol description language and an automatic translator to the encoding
presented in this paper. The second point would be, on the one hand, to extend
the present work with conditional rules in order to get a more powerful behavior
description language and, on the other hand, to handle other tree grammars to
get finer approximations.

Finally, we think that approximations could be used for the verification of
systems other than cryptographic protocols. Rewriting-based approximations seem
to be a way to combine, in the same formalism, automated theorem proving
techniques and abstract interpretation: theorem proving for proving properties
needing high-level proof techniques - like induction - and approximations for
proving the remaining parts of the proof where abstract interpretation and
model-checking are enough.

Acknowledgments 

We would like to thank Pascal Brisset for discussion about cryptographic pro- 
tocols and Pierre-Etienne Moreau for technical help with ELAN. 

References 

1. D. Bolignano. Towards a Mechanization of Cryptographic Protocol Verification.
In Proc. 9th CAV Conf., Haifa (Israel), volume 1254 of LNCS. Springer-Verlag,
1997.

2. P. Borovansky, C. Kirchner, H. Kirchner, P.-E. Moreau, and M. Vittek. ELAN: A
logical framework based on computational systems. In Proc. 1st WRLA, volume 4
of ENTCS, Asilomar (California), 1996.

3. H. Comon, M. Dauchet, R. Gilleron, F. Jacquemard, D. Lugiez, S. Tison,
and M. Tommasi. Tree automata techniques and applications.
http://l3ux02.univ-lille3.fr/tata/, 1997.

4. J. Coquide, M. Dauchet, R. Gilleron, and S. Vagvolgyi. Bottom-up tree pushdown
automata and rewrite systems. In R. V. Book, editor, Proc. 4th RTA Conf., Como
(Italy), volume 488 of LNCS, pages 287-298. Springer-Verlag, 1991.

5. M. Dauchet and S. Tison. The theory of ground rewrite systems is decidable. In
Proc. 5th LICS Symp., Philadelphia (Pa., USA), pages 242-248, June 1990.

6. G. Denker, J. Meseguer, and C. Talcott. Protocol Specification and Analysis in
Maude. In Proc. 2nd WRLA Workshop, Pont-à-Mousson (France), 1998.

7. N. Dershowitz and J.-P. Jouannaud. Handbook of Theoretical Computer Science,
volume B, chapter 6: Rewrite Systems, pages 244-320. Elsevier Science Publishers
B. V. (North-Holland), 1990. Also as: Research report 478, LRI.

8. T. Genet. Tree Automata Library. http://www.loria.fr/ELAN/.

9. T. Genet. Decidable approximations of sets of descendants and sets of normal
forms. In Proc. 9th RTA Conf., Tsukuba (Japan), volume 1379 of LNCS, pages
151-165. Springer-Verlag, 1998.

10. T. Genet and F. Klay. Rewriting for cryptographic protocols verification
(extended version). Technical report, INRIA, 2000.
http://www.irisa.fr/lande/genet/publications.html.

11. R. Gilleron and S. Tison. Regular tree languages and rewrite systems. Fundamenta
Informaticae, 24:157-175, 1995.

12. F. Jacquemard. Decidable approximations of term rewriting systems. In
H. Ganzinger, editor, Proc. 7th RTA Conf., New Brunswick (New Jersey, USA),
pages 362-376. Springer-Verlag, 1996.

13. G. Lowe. An Attack on the Needham-Schroeder Public-Key Protocol. IPL, 56:131-
133, 1995.

14. G. Lowe. Breaking and fixing the Needham-Schroeder public-key protocol using
CSP and FDR. In Proc. 2nd TACAS Conf., Passau (Germany), volume 1055 of
LNCS, pages 147-166. Springer-Verlag, 1996.

15. G. Lowe. Some New Attacks upon Security Protocols. In 9th Computer Security
Foundations Workshop. IEEE Computer Society Press, 1996.

16. Mastercard & Visa. Secure Electronic Transactions. http://www.visa.com/set/,
1996.

17. C. A. Meadows. Analyzing the Needham-Schroeder Public Key Protocol: A
comparison of two approaches. In Proc. 4th ESORICS Symp., Rome (Italy), volume
1146 of LNCS, pages 351-364. Springer-Verlag, 1996.

18. D. Monniaux. Abstracting Cryptographic Protocols with Tree Automata. In Proc.
6th SAS, Venezia (Italy), 1999.

19. R. M. Needham and M. D. Schroeder. Using Encryption for Authentication in
Large Networks of Computers. CACM, 21(12):993-999, 1978.

20. L. Paulson. Proving Properties of Security Protocols by Induction. In 10th
Computer Security Foundations Workshop. IEEE Computer Society Press, 1997.

21. K. Salomaa. Deterministic Tree Pushdown Automata and Monadic Tree Rewriting 
Systems. J. of Computer and System Sciences, 37:367-394:, 1988. 

22. C. Weidenbach. Towards an Automatic Analysis of Security Protocols. In Proc. 
16th CADE Conf., Trento, (Italy), volume 1632 of LNAI, pages 378-382. Springer- 
Verlag, 1999. 




System Description: *SAT 
A Platform for the Development of Modal 
Decision Procedures* 



Enrico Giunchiglia and Armando Tacchella 



DIST, Università di Genova
Viale Causa 13 - 16145 Genova, Italy
{enrico,tac}@dist.unige.it



Abstract. *SAT is a platform for the development of modal decision 
procedures. Currently, *SAT features decision procedures for the normal 
modal logic K(m) and for the classical modal logic E(m). *SAT embodies 
a state of the art SAT solver, and includes techniques for optimizing 
automated deduction in modal and temporal logics. Owing to its modular 
design and to the extensive reuse of software components, *SAT provides 
an open, easy to maintain, yet efficient implementation framework. 



1 Introduction 

In this paper we present *SAT, a platform for the development of SAT-based 
decision procedures. By SAT-based we mean built on top of a SAT solver in 
the spirit of [1]. Currently, *SAT features SAT-based decision procedures for the 
normal modal logic K(m), and for the classical modal logic E(m) [9,2]. 

The *SAT propositional engine is an embedded version of SATO 3.2, one of 
the most efficient SAT checkers publicly available [3]. We chose SATO because 
it is a fast propositional reasoner and it features many optimizations that we 
exploited in *SAT. We also implemented other optimizations that speed up modal 
reasoning, like: 

— early investigation of modal successors [1], 

— internal optimized clause form conversions [2], and 

— caching structures and retrieval algorithms [4]. 

*SAT has been designed to be modular and to allow for an easy integration
of new decision procedures and optimizations. The system is implemented in C
and extensively reuses software components from state-of-the-art systems, i.e.,
SATO, and the GLU library of data types from the VIS model checking system [5].
The GLU library provides *SAT with efficient implementations of, e.g., lists,
hash tables, and sparse matrices. Taking SATO and GLU off-the-shelf, we inherit
and exploit in *SAT several years of experience in building highly optimized
data structures and algorithms for automated deduction in propositional and
temporal logics.

* We wish to thank Fausto Giunchiglia, Peter Patel-Schneider and Roberto Sebastiani
for useful discussions related to *SAT. This work is supported by MURST.

*SAT source code, documentation and experimental results are available on 
the WWW at: 

http://www.mrg.dist.unige.it/~tac/StarSAT.html



2 Algorithms 

For the sake of clarity, we introduce some preliminary notions. The set of formulas
is constructed starting from a given set of propositional letters and applying the
0-ary operators $\top$ and $\bot$ (representing truth and falsity respectively); the unary
operators $\neg$ and $\Box$; and the binary operators $\wedge$, $\vee$, $\supset$ and $\equiv$.¹ A modal logic
is a set of formulas (called theorems) closed under tautological consequence. A
formula $\varphi$ is consistent in a modal logic L (or L-consistent) if $\neg\varphi$ is not a theorem
of L, i.e., if $\neg\varphi \notin$ L. By atom we mean a propositional letter or a formula of the
form $\Box\varphi$. A literal is either an atom or the negation of an atom. An assignment
is any conjunction $\mu$ of literals such that for any pair $\psi$, $\psi'$ of conjuncts in $\mu$, it
is not the case that $\psi = \neg\psi'$. An assignment $\mu$ satisfies a formula $\varphi$ if $\mu$ entails
$\varphi$ by propositional reasoning.

Consider a formula $\varphi$. Let L be a modal logic. Whether $\varphi$ is L-consistent can
be determined by implementing two mutually recursive procedures:

— Lsat($\varphi$) for the generation of assignments satisfying $\varphi$, and

— Lconsist($\mu$) for testing the L-consistency of each generated assignment $\mu$.

The procedure Lsat is independent of the particular modal logic L considered, 
and can be based on any propositional decision procedure (see [2]). Indeed, 
the logic specific reasoning is delegated to Lconsist. Currently, *SAT features 
the procedures Econsist and Kconsist playing the role of Lconsist for the 
logics E(m) and K(m) respectively. For lack of space, we present only the Lsat 
algorithm here. For Econsist and Kconsist, see [2] and also [4]. 

The Lsat procedure implemented in *SAT is based on SATO 3.2 [3], an efficient
implementation of the Davis-Putnam-Logemann-Loveland (DP) procedure [6].
Figure 1 shows a high-level description of Lsat. In the Figure:

— cnf($\varphi$) is a set of clauses obtained from $\varphi$ by applying a conversion to
conjunctive normal form (CNF) based on renaming (see, e.g., [7]).

— choose-literal($\Phi$, $\mu$) returns a literal occurring in $\Phi$ and chosen according to
some heuristic criterion.

— if $l$ is a literal, $\bar{l}$ stands for $A$ if $l = \neg A$, and for $\neg A$ if $l = A$.

— for any literal $l$ and set of clauses $\Phi$, assign($l$, $\Phi$) is the set of clauses obtained
from $\Phi$ by (i) deleting the clauses in which $l$ occurs as a disjunct, and (ii)
eliminating $\bar{l}$ from the others.

¹ For simplicity, we consider the case with only one modality.

function Lsat($\varphi$) return Lsatdp(cnf($\varphi$), $\top$).

function Lsatdp($\Phi$, $\mu$)
  if $\Phi = \emptyset$ then return Lconsist($\mu$);                      /* base */
  if $\emptyset \in \Phi$ then return False;                              /* backtrack */
  if { a unit clause $\{l\}$ is in $\Phi$ }                               /* unit */
    then return Lsatdp(assign($l$, $\Phi$), $\mu \wedge l$);
  if not Lconsist($\mu$) then return False;                               /* early pruning */
  $l$ := choose-literal($\Phi$, $\mu$);                                   /* split */
  return Lsatdp(assign($l$, $\Phi$), $\mu \wedge l$) or
         Lsatdp(assign($\bar{l}$, $\Phi$), $\mu \wedge \bar{l}$).

Fig. 1. Lsat and Lsatdp



As can be observed, the procedure Lsatdp in Figure 1 is the DP-procedure
modulo (i) the call to Lconsist($\mu$) when it finds an assignment $\mu$ satisfying the
input formula ($\Phi = \emptyset$), and (ii) the early pruning step, i.e., a call to Lconsist($\mu$)
that forces backtracking after each unit propagation when incomplete assignments
are not L-consistent. Early pruning prevents *SAT from thrashing, i.e.,
from repeatedly generating different assignments that contain the same inconsistent
kernel [1,8].
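The control flow of Figure 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the *SAT implementation (which is written in C on top of SATO): literals are encoded as non-zero integers with -n standing for the negation of n, the helper names `clause_set`, `lsat` and `lsat_dp` are hypothetical, `choose_literal` uses a trivial placeholder heuristic, and the logic-specific test is passed in as a caller-supplied function `lconsist`.

```python
from typing import Callable, FrozenSet

Clause = FrozenSet[int]          # assumption: -n encodes the negation of atom n
ClauseSet = FrozenSet[Clause]
Assignment = FrozenSet[int]


def clause_set(*clauses) -> ClauseSet:
    """Build a clause set from iterables of integer literals."""
    return frozenset(frozenset(c) for c in clauses)


def assign(lit: int, clauses: ClauseSet) -> ClauseSet:
    """Delete the clauses containing lit; remove the complement of lit elsewhere."""
    return frozenset(c - {-lit} for c in clauses if lit not in c)


def choose_literal(clauses: ClauseSet, mu: Assignment) -> int:
    """Placeholder heuristic: the smallest atom still occurring in the clauses."""
    return min(abs(l) for c in clauses for l in c)


def lsat_dp(clauses: ClauseSet, mu: Assignment,
            lconsist: Callable[[Assignment], bool]) -> bool:
    if not clauses:                                   # base
        return lconsist(mu)
    if frozenset() in clauses:                        # backtrack
        return False
    unit = next((c for c in clauses if len(c) == 1), None)
    if unit is not None:                              # unit
        (lit,) = unit
        return lsat_dp(assign(lit, clauses), mu | {lit}, lconsist)
    if not lconsist(mu):                              # early pruning
        return False
    lit = choose_literal(clauses, mu)                 # split
    return (lsat_dp(assign(lit, clauses), mu | {lit}, lconsist)
            or lsat_dp(assign(-lit, clauses), mu | {-lit}, lconsist))


def lsat(clauses: ClauseSet, lconsist: Callable[[Assignment], bool]) -> bool:
    """Entry point: start from the empty assignment."""
    return lsat_dp(clauses, frozenset(), lconsist)
```

With `lconsist` always returning True the sketch degenerates to a plain DP procedure. A consistency test that rejects some assignments, for instance `lambda mu: 2 not in mu`, makes a propositionally satisfiable clause set such as `clause_set([1, 2], [-1, 2])` fail, which is the situation the base and early pruning steps are meant to catch.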

3 Implementation and Features 

The *SAT modular architecture is depicted in Figure 2. The thickest external box 
represents the whole system and, inside it, each solid box represents a different 
module. By module, we mean a set of routines dedicated to a specific task.² The
dashed horizontal lines single out the four main parts of *SAT: 

INTERFACE: The modules KRIS, KSATC, LWB, and TPTP are parsers for differ- 
ent input syntaxes. The module TREES stores the input formula as a tree, at 
the same time performing some simple preprocessing (e.g. pushing negations 
down to atoms). 

DATA: The module DAGS (for Directed Acyclic Graphs) implements the main
data structures of *SAT. The input formula is preprocessed and stored as a
DAG. A Look Up Table (LUT), mapping each modal atom $\Box\psi$ into a newly
introduced propositional letter $C_\psi$, is built. Then, each modal atom is replaced
by the corresponding propositional letter. The initial preprocessing allows
trivially equivalent³ modal atoms to be mapped into a single propositional letter, thus
fostering the detection of (un)satisfiable subformulae [1,10].

ENGINE: This part includes the module SAT, the propositional core of *SAT.
Since SAT implements a DP algorithm, techniques like semantic branching,
boolean constraint propagation (BCP) and heuristic guided search are inherited
for free by *SAT. The dashed box (labeled CNF) stands for a set of DPSAT
routines implementing CNF conversions based on renaming. CNF routines allow
*SAT to handle any formula even if the SAT decider accepts CNF formulae
only.

LOGICS: Currently, *SAT features the two modules K(m) and E(m), implementing
Kconsist and Econsist respectively. The dotted box is a placeholder
for other L-consistency modules that will be implemented in the near future.
CACHING implements data structures and retrieval algorithms that are used to
optimize the L-consistency checking routines contained in the logic dependent
modules (see [2] for caching in E(m) and [4] for caching in K(m)).

² As a matter of fact, each module corresponds to a file in the *SAT distribution package.
³ Technically, the preprocessing maps a formula $\varphi$ into a formula $\varphi'$ which is logically
equivalent to $\varphi$ in any classical modal logic (see [9] for the definition of classical
modal logic).

Fig. 2. *SAT modular architecture

The modules DPSAT, MONITOR and STAT span across different parts of *SAT.
DPSAT interfaces the inner modules with one another. The result is that these
modules are loosely coupled and can be modified/replaced (almost) independently
from each other. MONITOR records information about *SAT performance, e.g.,
cpu time, memory consumption, number of L-consistency checks. STAT explores
the preprocessed input formula and provides information like number of occurrences
of a variable, number of nested boxes. This information is used by different
modules, e.g., for dimensioning the internal data structures.

To understand the behavior of *SAT, let $\varphi$ be the formula
$\neg(\neg\Box(\Box C_2 \wedge \Box C_1) \wedge \Box C_2)$. *SAT first stores $\varphi$ as an intermediate representation (provided by
TREES) where it undergoes some preliminary transformations. In our case, $\varphi$
becomes $(\Box(\Box C_2 \wedge \Box C_1) \vee \neg\Box C_2)$. Then, the building of the internal
representation (provided by DAGS) causes lexical normalization ($(\Box C_2 \wedge \Box C_1)$ would be
rewritten into $(\Box C_1 \wedge \Box C_2)$) and propositional simplification (e.g., $(C \vee C)$ would
be rewritten into $C$) to be performed on $\varphi$. The resulting formula is represented
by the data structure depicted in Figure 3 (left). Next, *SAT creates the LUT
and replaces each modal atom with the corresponding propositional letter. The
result is depicted in Figure 3 (right), where the numbers appearing in the LUT
have the obvious meaning. Notice that the top-level formula $\varphi_0$ is
now purely propositional. If SAT accepts only CNF formulae then (i) for every
LUT entry $C_\psi$, both $\psi$ and $\neg\psi$ are converted to CNF and (ii) the top level
formula $\varphi_0$ is replaced by its CNF conversion. Finally, the core decision process
starts. SAT is properly initialized and called with $\varphi_0$ as input. Once a satisfying
truth assignment is found, a logic dependent module (e.g. K(m)) is called to
check its L-consistency. The recursive tests are built in constant time using the
LUT to reference the subformulae. The process continues until no more truth
assignments are possible or a model is found ([2] details this process for K(m),
E(m) and several other classical modal logics).

Fig. 3. Internal representation of concepts in *SAT
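The normalization and look-up-table steps of this walk-through can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: formulas are nested tuples rather than the DAG of *SAT, the functions `normalize` and `abstract` and the fresh letter names `L1`, `L2`, ... are hypothetical, and only operand ordering and the simplification of duplicated operands are modelled.

```python
def normalize(f):
    """Sort the operands of 'and'/'or' and collapse duplicates such as (C or C)."""
    op = f[0]
    if op == "atom":
        return f
    if op in ("not", "box"):
        return (op, normalize(f[1]))
    a, b = sorted((normalize(f[1]), normalize(f[2])))
    return a if a == b else (op, a, b)


def abstract(f, lut):
    """Replace every modal atom ('box', g) by a propositional letter.

    lut maps the (already abstracted) body g to the index of its letter, so
    syntactically equal modal atoms share a single entry.
    """
    op = f[0]
    if op == "atom":
        return f
    if op == "box":
        body = abstract(f[1], lut)              # nested boxes are abstracted first
        index = lut.setdefault(body, len(lut) + 1)
        return ("atom", f"L{index}")
    return (op,) + tuple(abstract(g, lut) for g in f[1:])


def box(g):
    return ("box", g)


C1, C2 = ("atom", "C1"), ("atom", "C2")
# The formula after negations have been pushed down, as in the text.
phi = ("or", box(("and", box(C2), box(C1))), ("not", box(C2)))

lut = {}
top = abstract(normalize(phi), lut)
```

In this encoding the two occurrences of the modal atom over `C2` end up sharing one LUT entry, and `top` contains no `box` node, mirroring the observation that the top-level formula becomes purely propositional.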
DLP (Description Logic Prover) is an experimental description logic knowledge
representation system. DLP implements an expressive description logic that
includes propositional dynamic logic as a subset. DLP provides a simple interface
allowing users to build knowledge bases of descriptions in this description
logic, but, as an experimental system, DLP does not have a full user interface.

Because of the correspondence between description logics and propositional
modal logics, DLP can serve as a reasoner for several propositional modal logics.
As well as propositional dynamic logic, the logic underlying DLP contains fragments
that are in direct correspondence to the propositional modal logics K(m)
and K4(m). DLP provides an interface that allows direct satisfiability checking
of formulae in K(m) and K4(m). Using a standard encoding, the interface also
allows satisfiability checking of formulae in KT(m) and S4(m).

DLP is available via the WWW at http://www.bell-labs.com/user/pfps. 
DLP is implemented in SML/NJ. The current version of DLP, version 4.1, in- 
cludes a number of new optimisations and options not included in previous 
versions. 

One of the purposes in building DLP was to investigate various optimisations 
for description logic systems. A number of these optimisations have appeared in 
various description logic systems [1,3,7]. As there is still need to investigate 
optimisations further and to develop new optimisation techniques, DLP has a 
number of compile-time options to select various description logic optimisations. 

DLP implements the description logic in Figure 1. In the syntax chart A is an 
atomic concept; C and D are arbitrary concepts; P is an atomic role; R and S are 
arbitrary roles; and n is an integer. There is an obvious correspondence between 
most of the constructs in this description logic and propositional dynamic logic, 
which is given in the chart. 



Implementation 

DLP uses the now-standard method for subsumption testing in description log- 
ics, namely translating subsumption tests into satisfiability tests and checking 
for satisfiability using an optimised tableaux method. DLP was designed from 
the beginning to be an experimental system. As a result, much more attention 
has been paid to making the internal algorithms correct and efficient in the 
worst-case than to reducing constant factors. Similarly, the internal data struc- 
tures have been chosen for their flexibility rather than having the absolute best 
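modification and access speeds. Some care has been taken to make the internal
data structures reasonably fast, however: there is considerable use of binary
maps and hash tables instead of lists to store sets, for example.

              DL Syntax        PDL Syntax   Semantics
Concepts      $A$              $A$          $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$
(Formulae)    $\top$           T            $\Delta^{\mathcal{I}}$
              $\bot$           F            $\emptyset$
              $\neg C$         ~C           $\Delta^{\mathcal{I}} - C^{\mathcal{I}}$
              $C \sqcap D$     C ∧ D        $C^{\mathcal{I}} \cap D^{\mathcal{I}}$
              $C \sqcup D$     C ∨ D        $C^{\mathcal{I}} \cup D^{\mathcal{I}}$
              $\exists R.C$    ⟨R⟩C         $\{d \in \Delta^{\mathcal{I}} : R^{\mathcal{I}}(d) \cap C^{\mathcal{I}} \neq \emptyset\}$
              $\forall R.C$    [R]C         $\{d \in \Delta^{\mathcal{I}} : R^{\mathcal{I}}(d) \subseteq C^{\mathcal{I}}\}$
              $\geq n P$                    $\{d \in \Delta^{\mathcal{I}} : |R^{\mathcal{I}}(d)| \geq n\}$
              $\leq n P$                    $\{d \in \Delta^{\mathcal{I}} : |R^{\mathcal{I}}(d)| \leq n\}$
              $P : n$                       $\{d \in \Delta^{\mathcal{I}} : R^{\mathcal{I}}(d) \ni n\}$
Roles         $P$              $P$          $P^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$
(Modalities)  $R \sqcup S$     R ∪ S        $R^{\mathcal{I}} \cup S^{\mathcal{I}}$
(Actions)     $R \circ S$      R ; S        $R^{\mathcal{I}} \circ S^{\mathcal{I}}$
              $R|_C$           R ; C?       $R^{\mathcal{I}} \cap (\Delta^{\mathcal{I}} \times C^{\mathcal{I}})$
                               R ; R*

Fig. 1. Simplified Syntax for DLP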

DLP is implemented in SML/NJ instead of a language like C so that it can 
be more-easily changed. There is some price to be paid for this, as SML/NJ 
does not allow some of the low-level optimisations possible in languages like 
C. Further, DLP is implemented in a mostly-functional fashion. The only non- 
functional portions of the satisfiability checker in DLP have to do with unique 
storage of formulae, and caching of several kinds of information. All this caching 
is monotone, i.e., it does not have to be undone during a proof, or even between
proofs. Nonetheless, DLP is quite fast on several problem sets, including the 
Tableaux’98 propositional modal logic comparison benchmark [9] and several 
collections of hard random formulae in K [10,8,11]. 



Optimisation Techniques 

Many of the optimisation techniques in DLP have already appeared in various 
description logic systems. The most complete description of these optimisations 
can be found in Ian Horrocks’ thesis [7]. The basic algorithm in DLP is a simple 
tableau algorithm that searches for a model that demonstrates the satisfiability 
of a description logic description or, equivalently, a propositional modal logic 
formula. The algorithm processes modal constructs by building successor nodes
with attached formulae that represent related possible worlds. The algorithm 
incorporates the usual control mechanism to guarantee termination, including a 
check for equality of the formulae at nodes to guarantee termination for transitive 
roles (modalities). 
Before the model search algorithm in DLP starts, incoming formulae are
converted into a normal form, and common sub-formulae are uniquely stored. 
This conversion detects analytically satisfiable sub-formulae. This unique storage 
of formulae also allows values to be efficiently given to any sub-formula in the 
formula, not just propositional variables. This can result in search failures being 
detected much earlier than would otherwise be the case. 

DLP performs semantic branching search. When DLP decides to branch, 
it picks a formula and assigns that formula to true and false in turn instead of 
picking a disjunction and assigning each of its disjuncts to true in turn. Semantic 
branching is guaranteed to explore each section of the search space at most once, 
as opposed to syntactic branching, and this is important in propositional modal 
logics as the generation and analysis of successors can result in large overlap in 
the search space when using syntactic branching. 
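The difference between the two branching schemes can be made concrete on a single disjunction. The following Python sketch is an illustration only and does not reflect DLP's SML/NJ code; the function names are hypothetical, and the disjuncts are assumed to be distinct propositional atoms. It enumerates the total truth assignments covered by each branch and shows that syntactic branches overlap while semantic branches are disjoint.

```python
from itertools import product


def syntactic_branches(disjuncts):
    """One branch per disjunct: assume that disjunct true, nothing else."""
    return [{d: True} for d in disjuncts]


def semantic_branches(disjuncts):
    """Branch on a disjunct being true; in the other branch it is false."""
    branches, fixed = [], {}
    for d in disjuncts:
        branches.append({**fixed, d: True})
        fixed[d] = False
    return branches


def models(branch, atoms):
    """All total assignments over atoms that extend the partial branch."""
    free = [a for a in atoms if a not in branch]
    return {
        tuple(sorted({**branch, **dict(zip(free, values))}.items()))
        for values in product([False, True], repeat=len(free))
    }


atoms = ["a", "b", "c"]                      # the disjunction a or b or c
syn = [models(b, atoms) for b in syntactic_branches(atoms)]
sem = [models(b, atoms) for b in semantic_branches(atoms)]

print(sum(len(m) for m in syn), len(set().union(*syn)))   # 12 visited, 7 distinct
print(sum(len(m) for m in sem), len(set().union(*sem)))   # 7 visited, 7 distinct
```

Both schemes cover the same seven satisfying assignments, but the syntactic branches visit twelve in total because assignments making several disjuncts true are reached more than once.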

DLP looks for formulae whose value is determined by the current set of 
assignments, and immediately gives these formulae the appropriate value. This 
technique can result in dramatic reductions in the search space, particularly in 
the presence of semantic branching. 

For every sub-formula DLP keeps track of which choice points lead to the 
deduction of that sub-formula. When backtracking to a choice point, DLP checks 
to see if the current search failure depends on that choice; if it does not, the 
alternative branch need not be considered, as it would just lead to the same 
failure. This technique, often called backjumping [2], can dramatically reduce 
the search space, but does have some overhead. 

During a satisfiability check successor nodes with the same set of formulae 
as a previously-encountered node are often generated. As all that matters is 
whether the node is satisfiable or not, DLP caches and reuses their status. Care 
has to be taken to ensure that caching does not interfere with the rest of the 
algorithm, particularly the determination of dependencies and loop analysis. 
Caching does require that information about each node generated be retained 
for a longer period of time than required for a basic depth-first implementation 
of the satisfiability checker. However, caching can produce dramatic gains in 
speed. 
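A minimal sketch of such a status cache is given below in Python. It is not DLP's data structure: the class name `NodeCache`, the string encoding of formulae, and the toy satisfiability check are assumptions made for illustration, and the dependency and loop information discussed above is deliberately left out.

```python
class NodeCache:
    """Remember the satisfiability status of each distinct set of formulae."""

    def __init__(self, check):
        self._check = check      # caller-supplied satisfiability test
        self._status = {}        # frozenset of formulae -> bool
        self.misses = 0          # number of times the real test was run

    def satisfiable(self, formulae):
        key = frozenset(formulae)            # order and duplicates are irrelevant
        if key not in self._status:
            self.misses += 1
            self._status[key] = self._check(key)
        return self._status[key]


def toy_check(formulae):
    """Toy stand-in for a tableau test: a clash is a formula with its '~' negation."""
    return not any("~" + f in formulae for f in formulae)


cache = NodeCache(toy_check)
first = cache.satisfiable(["p", "q"])
second = cache.satisfiable(["q", "p"])       # same node, answered from the cache
third = cache.satisfiable(["p", "~p"])
```

Entries are only ever added, never retracted, which matches the monotone style of caching described earlier for DLP.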

There are many heuristic techniques that can be used to determine which 
sub-formula to branch on first. However, these techniques require considerable 
information to be computed for each sub-formula of the unexpanded disjunctions.
Further, the heuristic techniques available have mostly been devised for
non-modal logics and are not necessarily suitable for modal logics. Nonetheless, 
DLP includes some simple heuristics to guide its search, mostly heuristics for 
more-effective backjumping. 



New Techniques 

Version 4.1 of DLP includes quite a number of new techniques to improve its 
performance. 

In previous versions of DLP, the cache did not include dependency information,
which meant that a conservative approximation to this information had
to be made, possibly resulting in less-than-optimal backjumping. The formula
cache has now been expanded to incorporate the dependency information needed
in backjumping, so that caching does not interfere with backjumping. Of course,
this does increase the size of cache entries.

The low-level computations in DLP used to be quite expensive for very large 
formulae. If the formula was also difficult to solve, this cost would be masked 
by the search time, but if the formula was easy to solve, the low-level compu- 
tation cost would dominate the solution time. Version 4.1 of DLP dramatically 
reduces the time taken for low-level computations both by reducing the amount 
of heuristic information generated when there are many clauses active and also 
by caching some of this information so that it does not have to be repeatedly 
computed. Of course, DLP is still much slower on large-but-easy formulae than 
provers that use imperative techniques, but such provers are much harder to 
build and debug than DLP. 

DLP used to completely generate assignments for the current node before 
investigating any modal successors. The current version of DLP has an option to 
investigate modal successors whenever a choice point is encountered, a technique 
taken from KSatC [5]. This option can be beneficial but often increases solution 
times. 

DLP can now retain not only the status of nodes, but the model found if the 
node is satisfiable. This model can be used to restart the search when reinves- 
tigating modal successors, reducing the time overhead for early investigation of 
modal successors — at the cost of considerably increasing the space required for 
the cache. DLP can now also return a model for satisfiable formulae. 

DLP now incorporates a variant of dynamic backtracking [4]. When jumping 
over a choice point, a determination is made as to whether any invalidated 
branch(es) from that choice point depends on the choice being changed. If it 
does not, then the search ignores the invalidated branch(es) when the choice 
point is again encountered. 



Summary 

DLP has not been used in any actual applications, and as an experimental sys- 
tem, it is unlikely to receive any such use. DLP has been used to classify a 
version of the Galen medical knowledge base [12]. DLP performed capably on 
this knowledge base, creating the subsumption partial order in 210 seconds on 
a Sparc Ultra 1-class machine. DLP has also been tested on several sets of 
benchmarks, including the Tableaux’98 comparison benchmarks [6] and several 
collections of hard random modal formulae [10,8,11]. DLP is the fastest modal 
decision procedure for many of these tests. 

As it is an experimental system, I did not expect DLP to be particularly fast 
on hard problems. It was gratifying to me that it is competitive with existing 
propositional modal reasoners including FaCT and KSatC. My current plan for 
DLP is to incorporate inverse roles (converse modalities), a change that requires 
considerable modification to the implementation of the system. 
Abstract. This article introduces two techniques to improve the propagation
efficiency of CSP based finite model generation methods. One
approach consists in statically rewriting some selected clauses so as to
trigger added constraint propagations. The other approach uses a dynamic
lookahead strategy to both filter out inconsistent domain values
and select the most appropriate branching variable according to a first
fail heuristic.



1 Introduction 

Many methods have been implemented to deal with many-sorted or uni-sorted 
theories: FINDER [7], FMSET [3], SATO [8], SEM [11], EMC [5] are known 
systems which solved some open problems. 

The method SEM (System for Enumerating Models) introduced by J. Zhang 
and H. Zhang in [11] is one of the most powerful known methods for solving 
problems expressed as many-sorted theories. 

The goal of this article is to explore ways to improve SEM by increasing the
propagations it performs (i.e. the number of inferred negative ground literals) so
as to reduce the search space and overall computation time. A first possible
improvement is a static preprocessing which automatically rewrites clauses having
a specific structure.

A second improvement consists in a dynamic domain filtering achieved by
using a lookahead at some nodes of the search tree. This lookahead procedure
uses unit propagation and detects incompatible assignments (e.g. trying $f(0) = 0$,
then $f(0) = 1$ ...). This filtering is augmented by the introduction of a new
heuristic, in the spirit of the SATZ propositional solver (see [4]).

This article is organized as follows: Section 2 defines the first-order logic
theories accepted as input language and the background of the SEM algorithm.
In section 3 we study two techniques which improve SEM efficiency. In section 4,
we compare our work with other methods on mathematical problems. Section
5 concludes.

2 Background and SEM Description

The theories accepted as input by the model generator SEM are many-sorted
first-order theories, with equality, without existential quantifiers, in clause normal
form (CNF). Since we are interested in finite models only, all sorts are finite.
Because all the variables are universally quantified, the quantifiers are usually
omitted.

We call the degree of a literal the number of its functional symbol occurrences.
We call a cell a ground term $f(e_1, \ldots, e_k)$ where all $e_i$ are sort elements. An
interpretation of a theory maps each cell to a value from the appropriate sort.
A model of a theory is an interpretation which satisfies all its clauses.

As an initial preprocessing stage, SEM expands the original theory axioms
to the set of their terminal instances (i.e. ground clauses), by substituting for
each logical variable all the members of the appropriate sort. SEM’s finite model
search algorithm is described in Algorithm 1. It uses the following parameters: $A$
the set of assignments, $B$ the set of unassigned cells and their possible values
and $C$ the set of ground clauses. The function Propa of the search algorithm
propagates the assignment from $A$ to $C$. This simplifies $C$ and may force some
cells in $B$ to become assigned. It modifies $(A, B, C)$ until a fixed point is reached
or an inconsistency is detected, and returns the modified triple $(A, B, C)$ upon
success. For a full description of SEM and the propagation algorithm one can
refer to [9] and [11].



Function Search($A$, $B$, $C$): Return Boolean
  If $B = \emptyset$ Then Return TRUE
  Choose and delete $(ce_i, D_i)$ from $B$
  If $D_i = \emptyset$ Then Return FALSE
  For All $e \in D_i$ Do
    $(A', B', C')$ = Propa($A \cup \{(ce_i, e)\}$, $B$, $C$)
    If $C' \neq$ False Then Search($A'$, $B'$, $C'$)

Algorithm 1: SEM Search Algorithm
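To make the interplay between search and propagation concrete, here is a small Python sketch in the spirit of Algorithm 1. It is not SEM's implementation. It assumes that every ground literal has already been flattened to the form "cell = value" or "cell ≠ value", encoded as a triple `(cell, value, positive)`; the function names `propa` and `search` and the data layout (dictionaries for assignments and domains) are choices made for this illustration. Unlike the pseudocode, `search` returns the model it finds.

```python
def propa(assignment, domains, clauses):
    """Unit propagation over ground clauses of literals (cell, value, positive).

    Returns (assignment, domains) at a fixed point, or None on inconsistency.
    """
    assignment = dict(assignment)
    domains = {c: set(d) for c, d in domains.items()}
    for cell, value in assignment.items():
        if value not in domains[cell]:
            return None
        domains[cell] = {value}
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            open_lits, satisfied = [], False
            for cell, value, positive in clause:
                possible = value in domains[cell]
                forced = domains[cell] == {value}
                if (positive and forced) or (not positive and not possible):
                    satisfied = True          # literal already true
                    break
                if (positive and possible) or (not positive and not forced):
                    open_lits.append((cell, value, positive))
            if satisfied:
                continue
            if not open_lits:                 # every literal is false
                return None
            if len(open_lits) == 1:           # unit clause: force its literal
                cell, value, positive = open_lits[0]
                domains[cell] = {value} if positive else domains[cell] - {value}
                changed = True
        for cell, dom in domains.items():
            if not dom:
                return None
            if len(dom) == 1:
                assignment[cell] = next(iter(dom))
    return assignment, domains


def search(assignment, domains, clauses):
    """Return a model (dict cell -> value) or None if there is none."""
    state = propa(assignment, domains, clauses)
    if state is None:
        return None
    assignment, domains = state
    unassigned = [c for c in domains if c not in assignment]
    if not unassigned:                        # B is empty: every cell has a value
        return assignment
    cell = min(unassigned, key=lambda c: (len(domains[c]), c))
    for value in sorted(domains[cell]):
        model = search({**assignment, cell: value}, domains, clauses)
        if model is not None:
            return model
    return None
```

For example, with cells `("f", 0)` and `("f", 1)` over the domain {0, 1}, the clauses "f(0) ≠ 0" and "f(0) ≠ 1 ∨ f(1) = 0" are solved by propagation alone, without any decision point.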



3 Two Domain Filtering Techniques 

SEM’s propagation algorithm allows propagation of negative assignments only
when literals of the form $ce \neq e$ exist in the set of clauses ($ce$ is a cell and $e$ an
element). Otherwise, only positive facts (value assignments) are propagated. This
leads to an increase in the number of decision points necessary for the search, and
potentially increases run times. Because SEM performs some amount of value
elimination using negative facts, one approach consists in favoring it by rewriting
some clauses to give them the appropriate structure. This static technique is
performed in a preprocessing phase.

The second approach is dynamic, involving computations performed at each
decision node. It consists in using a lookahead technique, called unit propagation,
to eliminate selected values from the domains of selected cells.



3.1 Clauses Transformation 

SEM performs value elimination when clauses contain negative literals of degree
two. We can thus rewrite some selected clauses, using a flattening technique
as in FMSET [3], into logically equivalent clauses containing
negative literals of degree two. Because such a transformation introduces auxiliary
variables, and thus increases the number of ground clause instances, such a
rewriting is a tradeoff. Candidate clauses are thus carefully selected: we restrict
the rewriting process to clauses of degree 3. On some problems this transformation
drastically reduces the number of decision points and the execution
times, as shown in section 4, where the results obtained with this rewriting technique
are listed under the name CTSEM (Clause Transformation in SEM).

Definition 1. A reducible clause is a clause which contains a literal with the
following pattern: $f(x_1, \ldots, x_m, g(x_{k+1}, \ldots, x_l), x_{m+1}, \ldots, x_k) = x_0$ where $f$, $g$ are two
functional symbols and the $x_i$ are variables. Such a literal is called reducible.

By using the clause transformation algorithm described in [3], we can rewrite
each reducible literal to the form: $f(x_1, \ldots, x_m, v, x_{m+1}, \ldots, x_k) = x_0 \vee v \neq g(x_{k+1}, \ldots, x_l)$.
This preserves the semantics of the clause, and introduces the
negative literal $v \neq g(x_{k+1}, \ldots, x_l)$. It requires the introduction of an auxiliary
variable $v$.

Example 1. The literal $h(h(x, y), x) = y$ is reducible and can be transformed to
its logical equivalent $h(v, x) = y \vee v \neq h(x, y)$. Now, the ground clause
$h(0, 1) \neq 0 \vee h(0, 0) = 1$ exists in the set of ground clauses. When we assign $h(0, 0)$ to 0
the second literal of the previous clause becomes false. So SEM can propagate
the fact $h(0, 1) \neq 0$. This eliminates 0 from the domain of the cell $h(0, 1)$.
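The equivalence used in Example 1 can be checked by brute force on a small domain. The Python sketch below is only a sanity check under the assumption of a two-element sort; the function names are hypothetical. It enumerates all sixteen binary operations $h$ on {0, 1} and compares the truth of the original literal with that of its flattened form, both read as universally quantified.

```python
from itertools import product

DOMAIN = (0, 1)


def all_binary_operations():
    """Yield every function h: DOMAIN x DOMAIN -> DOMAIN as a dict on pairs."""
    cells = list(product(DOMAIN, repeat=2))
    for values in product(DOMAIN, repeat=len(cells)):
        yield dict(zip(cells, values))


def original_holds(h):
    # for all x, y:  h(h(x, y), x) = y
    return all(h[h[x, y], x] == y for x, y in product(DOMAIN, repeat=2))


def flattened_holds(h):
    # for all v, x, y:  h(v, x) = y  or  v != h(x, y)
    return all(h[v, x] == y or v != h[x, y]
               for v, x, y in product(DOMAIN, repeat=3))


equivalent = all(original_holds(h) == flattened_holds(h)
                 for h in all_binary_operations())
```

The check also makes the tradeoff visible: over a sort of size $n$ the original literal has $n^2$ ground instances while the flattened clause has $n^3$, because of the auxiliary variable $v$.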



3.2 Value Elimination 

At a given node of the search tree, let $B$ equal the set of yet unassigned cells
and $C_A$ the set of axioms simplified by assignments of cells in $A$. In other words
$C_A = C'$ such that $(A', B', C') = \mathrm{Propa}(A, B, C)$. Let $(ce, D_{ce}) \in B$ and let
$e \in D_{ce}$. If Propa($A \cup \{(ce, e)\}$, $B$, $C$) leads to inconsistency, then we can remove
the value $e$ from $D_{ce}$.

However, such unit propagations are time consuming. We must restrict the
number of calls to Propa in order to obtain an efficient method. We use here a
property similar to the one introduced for SAT problems in [2]. After the call to
Propa($A \cup \{(ce, e)\}$, $B$, $C$), there are two possibilities:

- $C_{A \cup \{(ce,e)\}} = \mathit{False}$ and then $D_{ce} = D_{ce} - \{e\}$: value elimination.

- $C_{A \cup \{(ce,e)\}} \neq \mathit{False}$: the value assignments $(ce_i, e_i)$ propagated during the
process are compatible with the current assignment and would not lead to
value elimination if tried later.

This drastically reduces the number of possible candidates for propagation, 
and minimizes the number of calls to Propa. Formally, we have the following 
propositions : 

Proposition 1. Let $(ce, e) \in B$. If $C_{A \cup \{(ce,e)\}} \vdash \bot$ then $C_A$ is equivalent to
$C_A \wedge (ce \neq e)$.

This property is used to eliminate values from the cell domains. 
Proposition 2. Let (ce, e) G B, if CAu{ice,e)} H (cei, ei), ...(ce„, e„) and if 

F AU{(ce,e)} ^ -f then Vz G Tl'^\cei G B^ F Au{{cei,ei)} ^ -f 

This property avoids to propagate useless facts. This allows to perform fewer 
calls to the Propa procedure. 

An additional possibility to reduce the number of unit propagations is to
select which cells must be tried. We denote by $T \subseteq B$ the set of cells which
are candidates for unit propagation. Because of symmetries (LNH), only the cells
with indices less than or equal to mdn need to be considered. We call those cells
mdn cells. The results obtained using this dynamic filtering technique are listed
in section 4 under the name VESEM.
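The value elimination loop, together with the shortcut justified by Proposition 2, can be sketched as follows. This is only an illustration under stated assumptions, not the VESEM implementation: `propagate` stands in for Propa and is assumed to return `None` on inconsistency and the extended assignment otherwise, and `toy_propagate` is a hypothetical axiom set (two cells `a` and `b` over {0, 1, 2} with `b = a + 1`) invented for the example.

```python
from typing import Callable, Dict, Iterable, Optional, Set, Tuple

Assignment = Dict[str, int]


def eliminate_values(
    assigned: Assignment,
    domains: Dict[str, Set[int]],
    propagate: Callable[[Assignment], Optional[Assignment]],
    candidates: Iterable[str],
) -> int:
    """Prune `domains` in place; return the number of propagation calls."""
    known_safe: Set[Tuple[str, int]] = set()  # pairs that need no further test
    calls = 0
    for ce in candidates:
        for e in sorted(domains[ce]):
            if (ce, e) in known_safe:
                continue  # Proposition 2: already known to be consistent
            calls += 1
            result = propagate({**assigned, ce: e})
            if result is None:
                domains[ce].discard(e)  # Proposition 1: value elimination
            else:
                known_safe.update(result.items())
    return calls


def toy_propagate(assignment: Assignment) -> Optional[Assignment]:
    """Hypothetical stand-in for Propa: cells a, b in {0, 1, 2}, axiom b = a + 1."""
    full = dict(assignment)
    if "a" in full and "b" not in full:
        full["b"] = full["a"] + 1
    elif "b" in full and "a" not in full:
        full["a"] = full["b"] - 1
    if "a" in full and "b" in full:
        in_range = all(v in (0, 1, 2) for v in full.values())
        if not in_range or full["b"] != full["a"] + 1:
            return None
    return full
```

On the toy problem, starting from full domains for both cells, the loop removes 2 from the domain of `a` and 0 from the domain of `b`; the values 1 and 2 of `b` are skipped because they were propagated while testing `a`.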



3.3 A First Fail Heuristic 

In its original version, SEM chooses as the next cell the one with the smallest
domain, and tries not to increment mdn. We denote these conditions by H.
Our heuristic then chooses, as the next cell to instantiate, the one that
both satisfies conditions H and maximizes the number of propagations
performed on the cell over all its possible values. This approach is
similar to the one described in [4] for propositional logic. The heuristic
and the value elimination process are shown in Algorithm 2.

Remark 1. In algorithm 2, Mark[ce,e]=True means that the value e of the cell 
ce can be suppressed. The number Nb equals the number of propagations. 

4 Experiments

We compare SEM, VESEM (SEM + Value Elimination), CTSEM (SEM +
Clause Transformation preprocessing) and CTVESEM (SEM + Clause
Transformation + Value Elimination) on a set of well known problems. Run times are
in seconds. All experiments were carried out under Linux on a K6II 400 PC with
128 MB of RAM. A "+" indicates that a program fails to solve a problem in less
than two hours.
Function Up Heuristic(A, C): Return next cell to choose
  For All (ce, D) ∈ T Do
    For All e ∈ D Do Mark[ce, e] = True
  For All (ce, D) ∈ T Do
    Nb ← 0
    For All e ∈ D such that Mark[ce, e] = True Do
      (A', B', C') = Propa(A ∪ {(ce, e)}, B, C)
      For All (ce', e') propagated Do Mark[ce', e'] = False
      Nb ← Nb + 1
      If C' = False Then
        D ← D − {e}
        If |D| = 1 Then return ce
      Else
        w(ce) ← w(ce) + Nb
  Return ce with the smallest domain and maximising w

Algorithm 2: Value Elimination and Heuristic



4.1 Quasigroup Problems 

A quasigroup is a binary operator such that the equations a.x = b and x.a = b
have a unique solution for all a, b. We deal here with idempotent quasigroups,
satisfying the additional axiom (x.x = x). Adding different extra axioms leads
to several problem instances, fully described in [6]. None of these axioms is
reducible. The results obtained with quasigroups are listed in Table 1.

The results show that VESEM always explores fewer nodes than SEM. The
amount of memory required to solve these problems is the same with both
algorithms. Because of the cost of computing the heuristic, computation times are
not significantly improved in general, except on one example (QG6). Only two
examples (QG7 and QG1) exhibit results slightly worse with VESEM than with
SEM. Although the quasigroup problems do not clearly prove a superiority of
VESEM, they show that the value elimination and lookahead strategy generally
results in a favorable tradeoff and should be used.



4.2 Group and Ring Problems 

We compare VESEM, CTSEM and CTVESEM to SEM on a list of group and
ring problems described by J. Zhang in [10]. The results are listed in Table 2. Our
algorithms explore fewer nodes than SEM. The lookahead strategy implemented
in VESEM generally leads to improved computation times. The execution time
ratio is sometimes very large: about 60 for NG and GRP.

CTSEM and VESEM not only solve problems faster, but also solve problems of
much higher orders (NG, GRP, RU). To the best of our knowledge, this is the first
time that a program has computed a finite model for NG34 and RU24 or proved
the inconsistency of GRP38.

The program CTVESEM combining both techniques (Clause Transformation 
and Value Elimination) visits fewer search tree nodes. But, almost all the values 
suppressed (leading to skipped nodes) are due to the clause rewriting technique. 
Table 1. Quasigroup Problems - Comparison.

                           SEM                     VESEM
Problem   Nb Model    Time       Nodes        Time       Nodes
QG1 7         4         22          411         24          194
QG2 6         0        1.4           17        1.4            9
QG2 7         3         63          871         59          401
QG3 9         0        7.4       48 278        6.3       40 015
QG3 10        0       1416    7 948 372       1335    3 558 564
QG4 9        74        6.3       38 407        5.3       17 116
QG4 10        0       1263    6 946 603      1 099    2 941 094
QG5 14        0         83      320 728         53      106 703
QG5 15        0       2031    7 518 920       1306    2 251 311
QG6 11        0         40      840 542        2.3       13 690
QG6 12        0       2519   50 290 872        142      929 781
QG7 13        2       14.5       69 053         16       37 132
QG7 14        0        443    2 015 778        528    1 107 404



Thus, adding value elimination to clause transformation is redundant and
results in increased computation times. All results obtained and a fully detailed
description of the different algorithms described in this paper are available in
[1].



Table 2. Ring and Group Problems - Comparison

                          SEM                 VESEM               CTSEM              CTVESEM
Problem  Nb Models   Time      Nodes     Time     Nodes     Time     Nodes     Time     Nodes
AG 28        162      328    642 103      321    76 663      336    57 941      394    41 859
AG 32      2 295      940  2 037 525      956   624 304      968   101 356    1 272    76 393
NG 28         51    6 934  8 359 103      806   108 120      432   100 036    1 105    88 832
NG 29          0        +                 752    94 417      489   108 922    1 191    82 519
NG 34          3        +               5 450   504 182    3 469   478 337        +
GRP 31         0    3 831  2 751 805      272    21 821       97    14 711      378    24 691
GRP 32     2 712        +               1 620   740 797      529    35 546    2 204    93 420
GRP 38         0        +               6 690   584 374    3 480   442 039        +
RU 19          1    4 591  2 720 769     1729    94 326      848   197 953    1 666    15 741
RU 20         21        +               2 904   370 652    3 678   609 320    2 957   336 612
RU 24        445        +               5 029   434 006        +              5 019   366 597
RNA 14         0      592    646 421        +                354   131 355      426    45 162
RNA 15         0      592    646 421    1 021   150 538      513   144 613      682    56 623
RNA 16         ?        +                   +                  +                  +
RNB 17         0       15     13 148       20     6 389       15     2 287       21     1 309
RNB 18         0       16     13 238       36     2 171       16     2 377       34       852
5 Conclusion

We introduce two techniques that can be used to improve CSP approaches to
finite model generation of first order theories. Their efficiency stems from the
introduction of negative facts in the clause transformation technique case
(CTSEM), and from the elimination of domain values at some node of the search
tree in the dynamic filtering case (VESEM).

The behaviour of the algorithms on the AG and RNA problems suggests searching
for improvements in the heuristic strategy associated with the lookahead
procedure in VESEM, and also eliminating more isomorphic subspaces than is
currently done with the LNH heuristic used in those programs. VESEM seems
to provide the basis for a general algorithm for finite model search of first order
theories.
Abstract. This paper is concerned with methods that automatically
prove termination of term rewrite systems. The aim of dummy elimina- 
tion, a method to prove termination introduced by Ferreira and Zantema, 
is to transform a given rewrite system into a rewrite system whose termi- 
nation is easier to prove. We show that dummy elimination is subsumed 
by the more recent dependency pair method of Arts and Giesl. More pre- 
cisely, if dummy elimination succeeds in transforming a rewrite system 
into a so-called simply terminating rewrite system then termination of 
the given rewrite system can be directly proved by the dependency pair 
technique. Even stronger, using dummy elimination as a preprocessing 
step to the dependency pair technique does not have any advantages 
either. We show that to a large extent these results also hold for the 
argument filtering transformation of Kusakari et al. 



1 Introduction 

Traditional methods to prove termination of term rewrite systems are based 
on simplification orders, like polynomial interpretations [6,12,17], the recursive 
path order [7,14], and the Knuth-Bendix order [9,15]. However, the restriction 
to simplification orders represents a significant limitation on the class of rewrite 
systems that can be proved terminating. Indeed, there are numerous important 
and interesting rewrite systems which are not simply terminating, i.e., their ter- 
mination cannot be proved by simplification orders. Transformation methods 
(e.g. [5,10,11,16,18,20,21,22]) aim to prove termination by transforming a given 
term rewrite system into a term rewrite system whose termination is easier to 
prove. The success of such methods has been measured by how well they trans- 
form non-simply terminating rewrite systems into simply terminating rewrite 
systems, since simply terminating systems were the only ones where termination 
could be established automatically. 

In recent years, the dependency pair technique of Arts and Giesl [1,2] emerged 
as the most powerful automatic method for proving termination of rewrite sys- 
tems. For any given rewrite system, this technique generates a set of constraints 
which may then be solved by standard simplification orders. In this way, the 
power of traditional termination proving methods has been increased signifi- 
cantly, i.e., the class of systems where termination is provable mechanically by 
the dependency pair technique is much larger than the class of simply terminat-
ing systems. In light of this development, it is no longer sufficient to base the 
claim that a particular transformation method is successful on the fact that it 
may transform non-simply terminating rewrite systems into simply terminating 
ones. In this paper we compare two transformation methods, dummy elimination 
[11] and the argument filtering transformation [16], with the dependency pair 
technique. With respect to dummy elimination we obtain the following results: 

1. If dummy elimination transforms a given rewrite system $\mathcal{R}$ into a simply
terminating rewrite system $\mathcal{R}'$, then the termination of $\mathcal{R}$ can also be proved
by the most basic version of the dependency pair technique.

2. If dummy elimination transforms a given rewrite system $\mathcal{R}$ into a DP simply
terminating rewrite system $\mathcal{R}'$, i.e., the termination of $\mathcal{R}'$ can be proved by
a simplification order in combination with the dependency pair technique,
then $\mathcal{R}$ is also DP simply terminating.

These results are constructive in the sense that the constructions in the proofs
are solely based on the termination proof of $\mathcal{R}'$. This shows that proving
termination of $\mathcal{R}$ directly by dependency pairs is never more difficult than proving
termination of $\mathcal{R}'$. The second result states that dummy elimination is useless
as a preprocessing step to the dependency pair technique. Not surprisingly, the
reverse statements do not hold. In other words, as far as automatic termination
proofs are concerned, dummy elimination is no longer needed.

The recent argument filtering transformation of Kusakari, Nakamura, and
Toyama [16] can be viewed as an improvement of dummy elimination by
incorporating ideas of the dependency pair technique. We show that the first result
above also holds for the argument filtering transformation. The second result
does not extend in its full generality, but we show that under a suitable
restriction on the argument filtering applied in the transformation of $\mathcal{R}$ to $\mathcal{R}'$, DP
simple termination of $\mathcal{R}'$ also implies DP simple termination of $\mathcal{R}$.

The remainder of the paper is organized as follows. In the next section we 
briefly recall some definitions and results pertaining to termination of rewrite 
systems and in particular, the dependency pair technique. In Section 3 we relate 
the dependency pair technique to dummy elimination. Section 4 is devoted to 
the comparison of the dependency pair technique and the argument filtering 
transformation. We conclude in Section 5. 

2 Preliminaries 

An introduction to term rewrite systems (TRSs) can be found in [4], for example.
We first introduce the dependency pair technique. Our presentation combines
features of [2,13,16]. Apart from the presentation, all results stated below are due
to Arts and Giesl. We refer to [2,3] for motivations and proofs. Let $\mathcal{R}$ be a (finite)
TRS over a signature $\mathcal{F}$. As usual, all root symbols of left-hand sides of rewrite
rules are called defined, whereas all other function symbols are constructors. Let
$\mathcal{F}^\sharp$ denote the union of $\mathcal{F}$ and $\{f^\sharp \mid f \text{ is a defined symbol of } \mathcal{R}\}$, where $f^\sharp$ has
the same arity as $f$. Given a term $t = f(t_1, \ldots, t_n) \in \mathcal{T}(\mathcal{F}, \mathcal{V})$ with $f$ defined,
we write $t^\sharp$ for the term $f^\sharp(t_1, \ldots, t_n)$. If $l \to r \in \mathcal{R}$ and $t$ is a subterm of $r$ with
defined root symbol then the rewrite rule $l^\sharp \to t^\sharp$ is called a dependency pair of
$\mathcal{R}$. The set of all dependency pairs of $\mathcal{R}$ is denoted by $\mathrm{DP}(\mathcal{R})$. In examples we
often write $F$ for $f^\sharp$.

For instance, consider the following well-known one-rule TRS $\mathcal{R}$ from [8]:

$$f(f(x)) \to f(e(f(x))) \qquad (1)$$

Here f is defined, e is a constructor, and $\mathrm{DP}(\mathcal{R})$ consists of the two dependency
pairs

$$F(f(x)) \to F(e(f(x))) \qquad\qquad F(f(x)) \to F(x)$$
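The computation of dependency pairs can be sketched in a few lines. The term encoding below is an assumption made for illustration only: variables are strings, a function application is a tuple whose first component is the symbol, and the marked symbol $f^\sharp$ is written `f#`. The names `subterms`, `mark` and `dependency_pairs` are likewise hypothetical.

```python
def is_var(t):
    """Variables are encoded as plain strings."""
    return isinstance(t, str)


def subterms(t):
    """Yield t and all of its subterms, outermost first."""
    yield t
    if not is_var(t):
        for arg in t[1:]:
            yield from subterms(arg)


def mark(t):
    """Replace the root symbol f of t by the marked symbol f#."""
    return (t[0] + "#",) + t[1:]


def dependency_pairs(rules):
    """Return the dependency pairs of a TRS given as (lhs, rhs) pairs."""
    defined = {lhs[0] for lhs, _ in rules}
    pairs = []
    for lhs, rhs in rules:
        for t in subterms(rhs):
            if not is_var(t) and t[0] in defined:
                pair = (mark(lhs), mark(t))
                if pair not in pairs:
                    pairs.append(pair)
    return pairs


# The one-rule TRS (1): f(f(x)) -> f(e(f(x)))
RULE_1 = (("f", ("f", "x")), ("f", ("e", ("f", "x"))))
```

Applied to `[RULE_1]`, the function returns the two pairs listed above, with `f#` in place of $F$.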

An argument filtering [2] for a signature $\mathcal{F}$ is a mapping $\pi$ that associates with
every $n$-ary function symbol an argument position $i \in \{1, \ldots, n\}$ or a (possibly
empty) list $[i_1, \ldots, i_m]$ of argument positions with $1 \le i_1 < \cdots < i_m \le n$.
The signature $\mathcal{F}_\pi$ consists of all function symbols $f$ such that $\pi(f)$ is some list
$[i_1, \ldots, i_m]$, where in $\mathcal{F}_\pi$ the arity of $f$ is $m$. Every argument filtering $\pi$ induces
a mapping from $\mathcal{T}(\mathcal{F}, \mathcal{V})$ to $\mathcal{T}(\mathcal{F}_\pi, \mathcal{V})$, also denoted by $\pi$:

$$\pi(t) = \begin{cases}
t & \text{if $t$ is a variable,} \\
\pi(t_i) & \text{if } t = f(t_1, \ldots, t_n) \text{ and } \pi(f) = i, \\
f(\pi(t_{i_1}), \ldots, \pi(t_{i_m})) & \text{if } t = f(t_1, \ldots, t_n) \text{ and } \pi(f) = [i_1, \ldots, i_m].
\end{cases}$$

Thus, an argument filtering is used to replace function symbols by one of their
arguments or to eliminate certain arguments of function symbols. For example, if
$\pi(f) = \pi(F) = [1]$ and $\pi(e) = 1$, then we have $\pi(F(e(f(x)))) = F(f(x))$. However,
if we change $\pi(e)$ to $[\,]$, then we obtain $\pi(F(e(f(x)))) = F(e)$.
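The induced mapping on terms is a direct structural recursion. The sketch below uses an assumed encoding (variables as strings, applications as tuples headed by the symbol) and a hypothetical dictionary `pi` that maps a symbol either to an integer (collapse to that argument) or to a list of positions; symbols missing from `pi` keep all their arguments, which is a convention of this sketch only.

```python
def apply_filtering(pi, t):
    """Apply the argument filtering `pi` to the term `t`."""
    if isinstance(t, str):  # variable
        return t
    symbol, args = t[0], t[1:]
    # Assumption of this sketch: unlisted symbols keep every argument.
    spec = pi.get(symbol, list(range(1, len(args) + 1)))
    if isinstance(spec, int):  # pi(f) = i: replace the term by its i-th argument
        return apply_filtering(pi, args[spec - 1])
    # pi(f) = [i_1, ..., i_m]: keep only the listed argument positions
    return (symbol,) + tuple(apply_filtering(pi, args[i - 1]) for i in spec)


# The term F(e(f(x))) from the example above.
TERM = ("F", ("e", ("f", "x")))
```

With `{"f": [1], "F": [1], "e": 1}` the term `TERM` is mapped to the encoding of $F(f(x))$, and with `"e": []` to the encoding of $F(e)$, matching the two cases in the text.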

A preorder (or quasi-order) is a transitive and reflexive relation. A rewrite
preorder is a preorder $\succsim$ on terms that is closed under contexts and substitutions.
A reduction pair [16] consists of a rewrite preorder $\succsim$ and a compatible
well-founded order $>$ which is closed under substitutions. Here compatibility means
that the inclusion $\succsim \cdot > \; \subseteq \; >$ or the inclusion $> \cdot \succsim \; \subseteq \; >$ holds. In practice,
$>$ is often chosen to be the strict part $\succ$ of $\succsim$ (or the order where $s > t$ iff
$s\sigma \succ t\sigma$ for all ground substitutions $\sigma$). The following theorem presents the
(basic) dependency pair approach of Arts and Giesl.

Theorem 1. A TRS $\mathcal{R}$ over a signature $\mathcal{F}$ is terminating if and only if there
exists an argument filtering $\pi$ for $\mathcal{F}^\sharp$ and a reduction pair $(\succsim, >)$ such that
$\pi(\mathcal{R}) \subseteq \succsim$ and $\pi(\mathrm{DP}(\mathcal{R})) \subseteq >$.

Because rewrite rules are just pairs of terms, $\pi(\mathcal{R}) \subseteq \succsim$ is a shorthand for
$\pi(l) \succsim \pi(r)$ for every rewrite rule $l \to r \in \mathcal{R}$. In our example, when using
$\pi(e) = [\,]$, the inequalities $f(f(x)) \succsim f(e)$, $F(f(x)) > F(e)$, and $F(f(x)) > F(x)$
resulting from the dependency pair technique are satisfied by the recursive path
order, for instance. Hence, termination of this TRS is proved.




312 Jürgen Giesl and Aart Middeldorp



Rather than considering all dependency pairs at the same time, like in the
above theorem, it is advantageous to treat groups of dependency pairs separately.
These groups correspond to clusters in the dependency graph of $\mathcal{R}$. The nodes
of the dependency graph are the dependency pairs of $\mathcal{R}$ and there is an arrow
from node $l_1^\sharp \to t_1^\sharp$ to $l_2^\sharp \to t_2^\sharp$ if there exist substitutions $\sigma_1$ and $\sigma_2$ such that
$t_1^\sharp\sigma_1 \to^*_{\mathcal{R}} l_2^\sharp\sigma_2$. (By renaming variables in different occurrences of dependency
pairs we may assume that $\sigma_1 = \sigma_2$.) The dependency graph of $\mathcal{R}$ is denoted by
$\mathrm{DG}(\mathcal{R})$. We call a non-empty subset $\mathcal{C}$ of dependency pairs of $\mathrm{DP}(\mathcal{R})$ a cluster
if for every two (not necessarily distinct) pairs $l_1^\sharp \to t_1^\sharp$ and $l_2^\sharp \to t_2^\sharp$ in $\mathcal{C}$ there
exists a non-empty path in $\mathcal{C}$ from $l_1^\sharp \to t_1^\sharp$ to $l_2^\sharp \to t_2^\sharp$.

Theorem 2. A TRS $\mathcal{R}$ is terminating if and only if for every cluster $\mathcal{C}$ in
$\mathrm{DG}(\mathcal{R})$ there exists an argument filtering $\pi$ and a reduction pair $(\succsim, >)$ such
that $\pi(\mathcal{R}) \subseteq \succsim$, $\pi(\mathcal{C}) \subseteq \succsim \cup >$, and $\pi(\mathcal{C}) \cap > \; \neq \emptyset$.

Note that $\pi(\mathcal{C}) \cap > \; \neq \emptyset$ denotes the situation that $\pi(l^\sharp) > \pi(t^\sharp)$ for at least
one dependency pair $l^\sharp \to t^\sharp \in \mathcal{C}$.

In the above example, the dependency graph only contains an arrow from
$F(f(x)) \to F(x)$ to itself and thus $\{F(f(x)) \to F(x)\}$ is the only cluster. Hence,
with the refinement of Theorem 2 the inequality $F(f(x)) > F(e)$ is no longer
necessary. See [3] for further examples which illustrate the advantages of regarding
clusters separately.

Note that while in general the dependency graph cannot be computed
automatically (since it is undecidable whether $t_1^\sharp\sigma \to^*_{\mathcal{R}} l_2^\sharp\sigma$ holds for some $\sigma$), one
can nevertheless approximate this graph automatically, cf. [1,2,3, "estimated
dependency graph"]. In this way, the criterion of Theorem 2 can be mechanized.

Most classical methods for automated termination proofs are restricted to
simplification (pre)orders, i.e., to (pre)orders satisfying the subterm property
$f(\ldots t \ldots) \succ t$ or $f(\ldots t \ldots) \succsim t$, respectively. Hence, these methods cannot
prove termination of TRSs like (1), as the left-hand side of its rule is embedded
in the right-hand side (so the TRS is not simply terminating). However, with
the development of the dependency pair technique now the TRSs where an
automated termination proof is potentially possible are those systems where
the inequalities generated by the dependency pair technique are satisfied by
simplification (pre)orders.

A straightforward way to generate a simplification preorder $\succsim$ from a
simplification order $\succ$ is to define $s \succsim t$ iff $s \succ t$ or $s = t$, where $=$ denotes
syntactic equality. Such relations $\succsim$ are particularly relevant, since many existing
techniques generate simplification orders rather than preorders. By restricting
ourselves to this class of simplification preorders, we obtain the notion of DP
simple termination.

Definition 1. A TRS $\mathcal{R}$ is called DP simply terminating if for every cluster $\mathcal{C}$
in $\mathrm{DG}(\mathcal{R})$ there exists an argument filtering $\pi$ and a simplification order $\succ$ such
that $\pi(\mathcal{R} \cup \mathcal{C}) \subseteq \succsim$ and $\pi(\mathcal{C}) \cap \succ \; \neq \emptyset$.

Simple termination implies DP simple termination, but not vice versa. For 
example, the TRS (1) is DP simply terminating, but not simply terminating. 
The above definition coincides with the one in [13] except that we use the real
dependency graph instead of the estimated dependency graph of [1,2,3]. The 
reason for this is that we do not want to restrict ourselves to a particular com- 
putable approximation of the dependency graph, for the same reason that we do 
not insist on a particular simplification order to make the conditions effective. 

3 Dummy Elimination 

In [11], Ferreira and Zantema defined an automatic transformation technique
which transforms a TRS $\mathcal{R}$ into a new TRS $\mathrm{dummy}(\mathcal{R})$ such that termination
of $\mathrm{dummy}(\mathcal{R})$ implies termination of $\mathcal{R}$. The advantage of this transformation
is that non-simply terminating systems like (1) may be transformed into simply
terminating ones. Thus, after the transformation, standard techniques may be
used to prove termination.

Below we define Ferreira and Zantema's dummy elimination transformation.
While our formulation of $\mathrm{dummy}(\mathcal{R})$ is different from the one in [11], it is easily
seen to be equivalent.

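Definition 2. Let $\mathcal{R}$ be a TRS over a signature $\mathcal{F}$. Let e be a distinguished
function symbol in $\mathcal{F}$ of arity $m \ge 1$ and let $\diamond$ be a fresh constant. We write
$\mathcal{F}_\diamond$ for $(\mathcal{F} \setminus \{e\}) \cup \{\diamond\}$. The mapping $\mathrm{cap}\colon \mathcal{T}(\mathcal{F}, \mathcal{V}) \to \mathcal{T}(\mathcal{F}_\diamond, \mathcal{V})$ is inductively
defined as follows:

$$\mathrm{cap}(t) = \begin{cases}
t & \text{if } t \in \mathcal{V}, \\
\diamond & \text{if } t = e(t_1, \ldots, t_m), \\
f(\mathrm{cap}(t_1), \ldots, \mathrm{cap}(t_n)) & \text{if } t = f(t_1, \ldots, t_n) \text{ with } f \neq e.
\end{cases}$$

The mapping dummy assigns to every term in $\mathcal{T}(\mathcal{F}, \mathcal{V})$ a subset of $\mathcal{T}(\mathcal{F}_\diamond, \mathcal{V})$, as
follows:

$$\mathrm{dummy}(t) = \{\mathrm{cap}(t)\} \cup \{\mathrm{cap}(s) \mid s \text{ is an argument of an e symbol in } t\}.$$

Finally, we define

$$\mathrm{dummy}(\mathcal{R}) = \{\mathrm{cap}(l) \to r' \mid l \to r \in \mathcal{R} \text{ and } r' \in \mathrm{dummy}(r)\}.$$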

The mappings cap and dummy are illustrated in Figure 1, where we assume
that the numbered contexts do not contain any occurrences of e. Ferreira and
Zantema [11] showed that dummy elimination is sound.

Theorem 3. Let $\mathcal{R}$ be a TRS. If $\mathrm{dummy}(\mathcal{R})$ is terminating then $\mathcal{R}$ is
terminating.

For the one-rule TRS (1), dummy elimination yields the TRS consisting of
the two rewrite rules

$$f(f(x)) \to f(\diamond) \qquad\qquad f(f(x)) \to f(x)$$
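The dummy elimination transformation can be sketched directly from the definitions of cap and dummy. The encoding is an assumption of this illustration: variables are strings, applications are tuples headed by the symbol, the distinguished symbol is named `e`, and the fresh constant is written `o`. The function names are hypothetical.

```python
def cap(t, e="e", fresh="o"):
    """Replace every maximal subterm rooted by `e` with the fresh constant."""
    if isinstance(t, str):  # variable
        return t
    if t[0] == e:
        return (fresh,)
    return (t[0],) + tuple(cap(arg, e, fresh) for arg in t[1:])


def e_arguments(t, e="e"):
    """Yield every argument of every occurrence of `e` in t."""
    if isinstance(t, str):
        return
    for arg in t[1:]:
        if t[0] == e:
            yield arg
        yield from e_arguments(arg, e)


def dummy_term(t, e="e", fresh="o"):
    """dummy(t): cap(t) together with cap(s) for each argument s of an e."""
    result = [cap(t, e, fresh)]
    for s in e_arguments(t, e):
        capped = cap(s, e, fresh)
        if capped not in result:
            result.append(capped)
    return result


def dummy_trs(rules, e="e", fresh="o"):
    """dummy(R): one rule cap(l) -> r' for every r' in dummy(r)."""
    return [(cap(l, e, fresh), r2) for l, r in rules for r2 in dummy_term(r, e, fresh)]


# The one-rule TRS (1): f(f(x)) -> f(e(f(x)))
RULE_1 = (("f", ("f", "x")), ("f", ("e", ("f", "x"))))
```

For `[RULE_1]` this produces exactly the two rules shown above, with `("o",)` standing for the fresh constant.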
Fig. 1. The mappings cap and dummy.



In contrast to the original system, the new TRS is simply terminating and its
termination is easily shown automatically by standard techniques like the recursive
path order. Hence, dummy elimination can transform non-simply terminating
TRSs into simply terminating ones. However, as indicated in the introduction,
nowadays the right question to ask is whether it can transform non-DP simply
terminating TRSs into DP simply terminating ones. Before answering this
question we show that if dummy elimination succeeds in transforming a TRS into a
simply terminating TRS then the original TRS is DP simply terminating. Even
stronger, whenever termination of $\mathrm{dummy}(\mathcal{R})$ can be proved by a simplification
order, then the same simplification order satisfies the constraints of the
dependency pair approach. Thus, the termination proof using dependency pairs is not
more difficult or more complex than the one with dummy elimination.

Theorem 4. Let $\mathcal{R}$ be a TRS. If $\mathrm{dummy}(\mathcal{R})$ is simply terminating then $\mathcal{R}$ is
DP simply terminating.

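Proof. Let $\mathcal{F}$ be the signature of $\mathcal{R}$. We show that $\mathcal{R}$ is DP simply terminating
even without considering the dependency graph refinement. So we define an
argument filtering $\pi$ for $\mathcal{F}^\sharp$ and a simplification order $\succ$ on $\mathcal{T}(\mathcal{F}^\sharp_\pi, \mathcal{V})$ such that
$\pi(\mathcal{R}) \subseteq \succsim$ and $\pi(\mathrm{DP}(\mathcal{R})) \subseteq \succ$. The argument filtering $\pi$ is defined as follows:
$\pi(e) = [\,]$ and $\pi(f) = [1, \ldots, n]$ for every $n$-ary symbol $f \in (\mathcal{F} \setminus \{e\})^\sharp$. Moreover,
if e is a defined symbol, we define $\pi(e^\sharp) = [\,]$. Let $\sqsupset$ be any simplification order
that shows the simple termination of $\mathrm{dummy}(\mathcal{R})$. We define the simplification
order $\succ$ on $\mathcal{T}(\mathcal{F}^\sharp_\pi, \mathcal{V})$ as follows: $s \succ t$ if and only if $s' \sqsupset t'$, where $(\cdot)'$ denotes the
mapping from $\mathcal{T}(\mathcal{F}^\sharp_\pi, \mathcal{V})$ to $\mathcal{T}(\mathcal{F}_\diamond, \mathcal{V})$ that first replaces every marked symbol $F$
by $f$ and afterwards replaces every occurrence of the constant e by $\diamond$. Note that
$\succ$ and $\sqsupset$ are essentially the same. It is very easy to show that $\pi(t)' = \pi(t^\sharp)' = \mathrm{cap}(t)$
for every term $t \in \mathcal{T}(\mathcal{F}, \mathcal{V})$. Let $l \to r \in \mathcal{R}$. Because $\mathrm{cap}(l) \to \mathrm{cap}(r)$ is a rewrite
rule in $\mathrm{dummy}(\mathcal{R})$, we get $\pi(l)' = \mathrm{cap}(l) \sqsupset \mathrm{cap}(r) = \pi(r)'$ and thus $\pi(l) \succ \pi(r)$.
Hence $\pi(\mathcal{R}) \subseteq \succ$ and thus certainly $\pi(\mathcal{R}) \subseteq \succsim$. Now let $l^\sharp \to t^\sharp$ be a dependency
pair of $\mathcal{R}$, originating from the rewrite rule $l \to r \in \mathcal{R}$. From $t \trianglelefteq r$ ($\trianglelefteq$ denotes
the subterm relation) we easily infer the existence of a term $u \in \mathrm{dummy}(r)$
such that $\mathrm{cap}(t) \trianglelefteq u$. Since $\mathrm{cap}(l) \to u$ is a rewrite rule in $\mathrm{dummy}(\mathcal{R})$, we have
$\pi(l^\sharp)' = \mathrm{cap}(l) \sqsupset u$. The subterm property of $\sqsupset$ yields $u \sqsupseteq \mathrm{cap}(t) = \pi(t^\sharp)'$. Hence
$\pi(l^\sharp)' \sqsupset \pi(t^\sharp)'$ and thus $\pi(l^\sharp) \succ \pi(t^\sharp)$. We conclude that $\pi(\mathrm{DP}(\mathcal{R})) \subseteq \succ$. □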

The previous result states that dummy elimination offers no advantage
compared to the dependency pair technique. On the other hand, dependency pairs
succeed for many systems where dummy elimination fails [1,2] (an example is
given in the next section). One could imagine that dummy elimination may
nevertheless be helpful in combination with dependency pairs. Then to show
termination of a TRS one would first apply dummy elimination and afterwards
prove termination of the transformed TRS with the dependency pair technique.
In the remainder of this section we show that such a scenario cannot handle
TRSs which cannot already be handled by the dependency pair technique
directly. In short, dummy elimination is useless for automated termination proofs.
We proceed in a stepwise manner. First we relate the dependency pairs of $\mathcal{R}$ to
those of $\mathrm{dummy}(\mathcal{R})$.

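Lemma 1. If $l^\sharp \to t^\sharp \in \mathrm{DP}(\mathcal{R})$ then $\mathrm{cap}(l)^\sharp \to \mathrm{cap}(t)^\sharp \in \mathrm{DP}(\mathrm{dummy}(\mathcal{R}))$.

Proof. In the proof of Theorem 4 we observed that there exists a rewrite rule
$\mathrm{cap}(l) \to u$ in $\mathrm{dummy}(\mathcal{R})$ with $\mathrm{cap}(t) \trianglelefteq u$. Since $\mathrm{root}(\mathrm{cap}(t))$ is a defined symbol
in $\mathrm{dummy}(\mathcal{R})$, $\mathrm{cap}(l)^\sharp \to \mathrm{cap}(t)^\sharp$ is a dependency pair of $\mathrm{dummy}(\mathcal{R})$. □

Now we prove that reducibility in $\mathcal{R}$ implies reducibility in $\mathrm{dummy}(\mathcal{R})$.

Definition 3. Given a substitution $\sigma$, the substitution $\sigma_{\mathrm{cap}}$ is defined as $\mathrm{cap} \circ \sigma$
(i.e., the composition of cap and $\sigma$ where $\sigma$ is applied first).

Lemma 2. For all terms $t$ and substitutions $\sigma$, we have $\mathrm{cap}(t\sigma) = \mathrm{cap}(t)\sigma_{\mathrm{cap}}$.

Proof. Easy induction on the structure of $t$. □

Lemma 3. If $s \to^*_{\mathcal{R}} t$ then $\mathrm{cap}(s) \to^*_{\mathrm{dummy}(\mathcal{R})} \mathrm{cap}(t)$.

Proof. It is sufficient to show that $s \to_{\mathcal{R}} t$ implies $\mathrm{cap}(s) \to^{=}_{\mathrm{dummy}(\mathcal{R})} \mathrm{cap}(t)$.
There must be a rule $l \to r \in \mathcal{R}$ and a position $p$ such that $s|_p = l\sigma$ and
$t = s[r\sigma]_p$. If $p$ is below the position of an occurrence of e, then we have
$\mathrm{cap}(s) = \mathrm{cap}(t)$. Otherwise, $\mathrm{cap}(s)|_p = \mathrm{cap}(l\sigma) = \mathrm{cap}(l)\sigma_{\mathrm{cap}}$ by Lemma 2. Thus,
$\mathrm{cap}(s) \to_{\mathrm{dummy}(\mathcal{R})} \mathrm{cap}(s)[\mathrm{cap}(r)\sigma_{\mathrm{cap}}]_p = \mathrm{cap}(s)[\mathrm{cap}(r\sigma)]_p = \mathrm{cap}(t)$. □

Next we show that if there is an arrow between two dependency pairs in
the dependency graph of $\mathcal{R}$ then there is an arrow between the corresponding
dependency pairs in the dependency graph of $\mathrm{dummy}(\mathcal{R})$.

Lemma 4. Let $s$, $t$ be terms with defined root symbols. If $s^\sharp\sigma \to^*_{\mathcal{R}} t^\sharp\sigma$ for some
substitution $\sigma$, then $\mathrm{cap}(s)^\sharp\sigma_{\mathrm{cap}} \to^*_{\mathrm{dummy}(\mathcal{R})} \mathrm{cap}(t)^\sharp\sigma_{\mathrm{cap}}$.

Proof. Let $s = f(s_1, \ldots, s_n)$. We have $s^\sharp\sigma = f^\sharp(s_1\sigma, \ldots, s_n\sigma)$. Since $f^\sharp$ is a
constructor, no step in the sequence $s^\sharp\sigma \to^*_{\mathcal{R}} t^\sharp\sigma$ takes place at the root position
and thus $t^\sharp = f^\sharp(t_1, \ldots, t_n)$ with $s_i\sigma \to^*_{\mathcal{R}} t_i\sigma$ for all $1 \le i \le n$. We obtain
$\mathrm{cap}(s_i)\sigma_{\mathrm{cap}} = \mathrm{cap}(s_i\sigma) \to^*_{\mathrm{dummy}(\mathcal{R})} \mathrm{cap}(t_i\sigma) = \mathrm{cap}(t_i)\sigma_{\mathrm{cap}}$ for all $1 \le i \le n$ by
Lemmata 2 and 3. Hence $\mathrm{cap}(s)^\sharp\sigma_{\mathrm{cap}} \to^*_{\mathrm{dummy}(\mathcal{R})} \mathrm{cap}(t)^\sharp\sigma_{\mathrm{cap}}$. □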
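Finally we are ready for the main theorem of this section.

Theorem 5. Let $\mathcal{R}$ be a TRS. If $\mathrm{dummy}(\mathcal{R})$ is DP simply terminating then $\mathcal{R}$
is DP simply terminating.

Proof. Let $\mathcal{C}$ be a cluster in the dependency graph of $\mathcal{R}$. From Lemmata 1 and 4
we infer the existence of a corresponding cluster, denoted by $\mathrm{dummy}(\mathcal{C})$, in
the dependency graph of $\mathrm{dummy}(\mathcal{R})$. By assumption, there exists an argument
filtering $\pi'$ and a simplification order $\sqsupset$ such that $\pi'(\mathrm{dummy}(\mathcal{R}) \cup \mathrm{dummy}(\mathcal{C})) \subseteq \sqsupseteq$
and $\pi'(\mathrm{dummy}(\mathcal{C})) \cap \sqsupset \; \neq \emptyset$. Let $\mathcal{F}$ be the signature of $\mathcal{R}$. We define an
argument filtering $\pi$ for $\mathcal{F}^\sharp$ as follows: $\pi(f) = \pi'(f)$ for every $f \in (\mathcal{F} \setminus \{e\})^\sharp$,
$\pi(e) = [\,]$ and, if e is a defined symbol of $\mathcal{R}$, $\pi(e^\sharp) = [\,]$. Slightly different from
the proof of Theorem 4, let $(\cdot)'$ denote the mapping that just replaces every
occurrence of the constant e by $\diamond$ and every occurrence of $e^\sharp$ by $\diamond^\sharp$. It is easy to
show that $\pi(t)' = \pi'(\mathrm{cap}(t))$ for every term $t \in \mathcal{T}(\mathcal{F}, \mathcal{V})$ and $\pi(t^\sharp)' = \pi'(\mathrm{cap}(t)^\sharp)$
for every term $t \in \mathcal{T}(\mathcal{F}, \mathcal{V})$ with a defined root symbol. Similar to Theorem 4,
we define the simplification order $\succ$ on $\mathcal{T}(\mathcal{F}^\sharp_\pi, \mathcal{V})$ as $s \succ t$ if and only if $s' \sqsupset t'$.
We claim that $\pi$ and $\succ$ satisfy the constraints for $\mathcal{C}$, i.e., $\pi(\mathcal{R} \cup \mathcal{C}) \subseteq \succsim$ and
$\pi(\mathcal{C}) \cap \succ \; \neq \emptyset$. If $l \to r \in \mathcal{R}$, then $\mathrm{cap}(l) \to \mathrm{cap}(r) \in \mathrm{dummy}(\mathcal{R})$ and
thus $\pi(l)' = \pi'(\mathrm{cap}(l)) \sqsupseteq \pi'(\mathrm{cap}(r)) = \pi(r)'$. Hence $\pi(l) \succsim \pi(r)$. If $l^\sharp \to t^\sharp \in \mathcal{C}$,
then $\mathrm{cap}(l)^\sharp \to \mathrm{cap}(t)^\sharp \in \mathrm{dummy}(\mathcal{C})$ by Lemma 1 and thus $\pi(l^\sharp)' = \pi'(\mathrm{cap}(l)^\sharp) \sqsupseteq
\pi'(\mathrm{cap}(t)^\sharp) = \pi(t^\sharp)'$. Hence $\pi(l^\sharp) \succsim \pi(t^\sharp)$ and if $\pi'(\mathrm{cap}(l)^\sharp) \sqsupset \pi'(\mathrm{cap}(t)^\sharp)$, then
$\pi(l^\sharp) \succ \pi(t^\sharp)$. □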

We stress that the proof is constructive in the sense that a DP simple
termination proof of $\mathrm{dummy}(\mathcal{R})$ can be automatically transformed into a DP simple
termination proof of $\mathcal{R}$ (i.e., the orders and argument filterings required for the
DP simple termination proofs of $\mathrm{dummy}(\mathcal{R})$ and $\mathcal{R}$ are essentially the same).
Thus, the termination proof of $\mathrm{dummy}(\mathcal{R})$ is not simpler than a direct proof for
$\mathcal{R}$.

Theorem 5 also holds if one uses the estimated dependency graph of [1,2,3]
instead of the real dependency graph. As mentioned in Section 2, such a
computable approximation of the dependency graph must be used in
implementations, since constructing the real dependency graph is undecidable in general.
The proof is similar to the one of Theorem 5, since again for every cluster in the
estimated dependency graph of $\mathcal{R}$ there is a corresponding one in the estimated
dependency graph of $\mathrm{dummy}(\mathcal{R})$.



4 Argument Filtering Transformation 

By incorporating argument filterings, a key ingredient of the dependency pair 
technique, into dummy elimination, Kusakari, Nakamura, and Toyama [16] re- 
cently developed the argument filtering transformation. In their paper they 
proved the soundness of their transformation and they showed that it improves 
upon dummy elimination. In this section we compare their transformation to 
the dependency pair technique. We proceed as in the previous section. First we 
recall the definition of the argument filtering transformation. 
Definition 4. Let $\pi$ be an argument filtering, $f$ a function symbol, and $1 \le i \le
\mathrm{arity}(f)$. We write $f \perp_\pi i$ if neither $i \in \pi(f)$ nor $i = \pi(f)$. Given two terms $s$
and $t$, we say that $s$ is a preserved subterm of $t$ with respect to $\pi$ and we write
$s \trianglelefteq_\pi t$, if $s \trianglelefteq t$ and either $s = t$ or $t = f(t_1, \ldots, t_n)$, $s$ is a preserved subterm of
$t_i$, and $f \not\perp_\pi i$.

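Definition 5. Given an argument filtering $\pi$, the argument filtering $\bar\pi$ is defined
as follows:

$$\bar\pi(f) = \begin{cases}
\pi(f) & \text{if } \pi(f) = [i_1, \ldots, i_m], \\
[\pi(f)] & \text{if } \pi(f) = i.
\end{cases}$$

The mapping $\mathrm{AFT}_\pi$ assigns to every term in $\mathcal{T}(\mathcal{F}, \mathcal{V})$ a subset of $\mathcal{T}(\mathcal{F}_\pi, \mathcal{V})$, as
follows:

$$\mathrm{AFT}_\pi(t) = \{\pi(t) \mid \bar\pi(t) \text{ contains a defined symbol}\} \cup \bigcup_{s \in S} \mathrm{AFT}_\pi(s)$$

with $S$ denoting the set of outermost non-preserved subterms of $t$. Finally, we
define

$$\mathrm{AFT}_\pi(\mathcal{R}) = \{\pi(l) \to r' \mid l \to r \in \mathcal{R} \text{ and } r' \in \mathrm{AFT}_\pi(r) \cup \{\pi(r)\}\}.$$

Consider the term $t$ of Figure 1. Figure 2 shows $\mathrm{AFT}_\pi(t)$ for the two argument
filterings with $\pi(e) = [1]$ and $\pi(e) = 2$, respectively, and $\pi(f) = [1, \ldots, n]$ for
every other $n$-ary function symbol $f$. Here we assume that all numbered contexts
contain defined symbols, but no occurrence of e.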











[Figure 2: tree diagrams of $\pi(t)$ and $\mathrm{AFT}_\pi(t)$ for $\pi(e) = [1]$ and $\pi(e) = 2$; the drawing is not recoverable from the extracted text.]

Fig. 2. The mappings $\pi$ and $\mathrm{AFT}_\pi$.



So essentially, $\mathrm{AFT}_\pi(t)$ contains $\pi(s)$ for $s = t$ and for all (maximal)
subterms $s$ of $t$ which are eliminated if the argument filtering $\pi$ is applied to $t$.
However, one only needs terms $\pi(s)$ in $\mathrm{AFT}_\pi(t)$ where $s$ contained a defined
symbol outside eliminated arguments (otherwise the original subterm $s$
cannot have been responsible for a potential non-termination). Kusakari et al. [16]
proved the soundness of the argument filtering transformation.

Theorem 6. If $\mathrm{AFT}_\pi(\mathcal{R})$ is terminating then $\mathcal{R}$ is terminating.

We show that if $\mathrm{AFT}_\pi(\mathcal{R})$ is simply terminating then $\mathcal{R}$ is DP simply
terminating and again, a termination proof by dependency pairs works with the same
argument filtering $\pi$ and the simplification order used to orient $\mathrm{AFT}_\pi(\mathcal{R})$. Thus,
the argument filtering transformation has no advantage compared to dependency
pairs. We start with two easy lemmata.

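Lemma 5. Let $s$ and $t$ be terms. If $s \trianglelefteq_\pi t$ then $\pi(s) \trianglelefteq \pi(t)$.

Proof. By induction on the definition of $\trianglelefteq_\pi$. If $s = t$ then the result is trivial.
Suppose $t = f(t_1, \ldots, t_n)$, $s \trianglelefteq_\pi t_i$, and $f \not\perp_\pi i$. The induction hypothesis yields
$\pi(s) \trianglelefteq \pi(t_i)$. Because $f \not\perp_\pi i$, $\pi(t_i)$ is a subterm of $\pi(t)$ and thus $\pi(s) \trianglelefteq \pi(t)$ as
desired. □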



Lemma 6. Let $r$ be a term. For every subterm $t$ of $r$ with a defined root symbol
there exists a term $u \in \mathrm{AFT}_\pi(r)$ such that $\pi(t) \trianglelefteq u$.

Proof. We use induction on the structure of $r$. In the base case we must have
$t = r$ and we take $u = \pi(r)$. Note that $\pi(r) \in \mathrm{AFT}_\pi(r)$ because $\mathrm{root}(\bar\pi(r)) =
\mathrm{root}(r)$ is defined. In the induction step we distinguish two cases. If $t \trianglelefteq_\pi r$ then
we also have $t \trianglelefteq_{\bar\pi} r$ and hence $\bar\pi(t) \trianglelefteq \bar\pi(r)$ by Lemma 5. As $\mathrm{root}(\bar\pi(t)) = \mathrm{root}(t)$
is defined, the term $\bar\pi(r)$ contains a defined symbol. Hence $\pi(r) \in \mathrm{AFT}_\pi(r)$ by
definition and thus we can take $u = \pi(r)$. In the other case $t$ is not a preserved
subterm of $r$. This implies that $t \trianglelefteq s$ for some outermost non-preserved subterm
$s$ of $r$. The induction hypothesis, applied to $t \trianglelefteq s$, yields a term $u \in \mathrm{AFT}_\pi(s)$
such that $\pi(t) \trianglelefteq u$. We have $\mathrm{AFT}_\pi(s) \subseteq \mathrm{AFT}_\pi(r)$ and hence $u$ satisfies the
requirements. □



Theorem 7. Let $\mathcal{R}$ be a TRS and $\pi$ an argument filtering. If $\mathrm{AFT}_\pi(\mathcal{R})$ is simply
terminating then $\mathcal{R}$ is DP simply terminating.

Proof. Like in the proof of Theorem 4 there is no need to consider the dependency
graph. Let $\succ$ be a simplification order that shows the (simple) termination of
$\mathrm{AFT}_\pi(\mathcal{R})$. We claim that the dependency pair constraints are satisfied by $\pi$ and
$\succ$, where $\pi$ and $\succ$ are extended to $\mathcal{F}^\sharp$ by treating each marked symbol $F$ in the
same way as the corresponding unmarked $f$. For rewrite rules $l \to r \in \mathcal{R}$ we have
$\pi(l) \succ \pi(r)$ as $\pi(l) \to \pi(r) \in \mathrm{AFT}_\pi(\mathcal{R})$. Let $l^\sharp \to t^\sharp$ be a dependency pair of $\mathcal{R}$,
originating from the rewrite rule $l \to r$. We show that $\pi(l) \succ \pi(t)$ and hence,
$\pi(l^\sharp) \succ \pi(t^\sharp)$ as well. We have $t \trianglelefteq r$. Since $\mathrm{root}(t)$ is a defined function symbol
by the definition of dependency pairs, we can apply Lemma 6. This yields a term
$u \in \mathrm{AFT}_\pi(r)$ such that $\pi(t) \trianglelefteq u$. The subterm property of $\succ$ yields $u \succeq \pi(t)$.
By definition, $\pi(l) \to u \in \mathrm{AFT}_\pi(\mathcal{R})$ and thus $\pi(l) \succ u$ by compatibility of $\succ$
with $\mathrm{AFT}_\pi(\mathcal{R})$. Hence $\pi(l) \succ \pi(t)$ as desired. □

^ Argumentations similar to the proofs of Lemma 6 and Theorem 7 can also be found
in [16, Lemma 4.3 and Theorem 4.4]. However, [16] contains neither Theorem 7
nor our main Theorem 8, since the authors do not compare the argument filtering
transformation with the dependency pair approach.

Note that in the above proof we did not make use of the possibility to treat 
marked symbols differently from unmarked ones. This clearly shows why the 
dependency pair technique is much more powerful than the argument filtering 
transformation; there are numerous DP simply terminating TRSs which are no 
longer DP simply terminating if we are forced to interpret a defined function 
symbol and its marked version in the same way. As a simple example, consider 

$$\mathcal{R}_1 = \left\{ \begin{array}{ll}
x - 0 \to x & 0 \div \mathsf{s}(y) \to 0 \\
x - \mathsf{s}(y) \to \mathsf{p}(x - y) & \mathsf{s}(x) \div \mathsf{s}(y) \to \mathsf{s}((x - y) \div \mathsf{s}(y)) \\
\mathsf{p}(\mathsf{s}(x)) \to x &
\end{array} \right.$$

Note that $\mathcal{R}_1$ is not simply terminating as the rewrite step $\mathsf{s}(x) \div \mathsf{s}(\mathsf{s}(x)) \to
\mathsf{s}((x - \mathsf{s}(x)) \div \mathsf{s}(\mathsf{s}(x)))$ is self-embedding. To obtain a terminating TRS $\mathrm{AFT}_\pi(\mathcal{R}_1)$,
the rule $\mathsf{p}(\mathsf{s}(x)) \to x$ enforces $\mathsf{p} \not\perp_\pi 1$ and $\mathsf{s} \not\perp_\pi 1$. From $\mathsf{p} \not\perp_\pi 1$ and the rules for $-$
we infer that $\pi(-) = [1, 2]$. But then, for all choices of $\pi(\div)$, the rule $\mathsf{s}(x) \div \mathsf{s}(y) \to
\mathsf{s}((x - y) \div \mathsf{s}(y))$ is transformed into one that is incompatible with a simplification
order. So $\mathrm{AFT}_\pi(\mathcal{R}_1)$ is not simply terminating for any $\pi$. (Similarly, dummy
elimination cannot transform this TRS into a simply terminating one either.) On
the other hand, DP simple termination of $\mathcal{R}_1$ is easily shown by the argument
filtering $\pi(\mathsf{p}) = 1$, $\pi(-) = 1$, $\pi(-^\sharp) = [1, 2]$, and $\pi(f) = [1, \ldots, \mathrm{arity}(f)]$ for
every other function symbol $f$ in combination with the recursive path order.
This example illustrates that treating defined symbols and their marked versions
differently is often required in order to benefit from the fact that the dependency
pair approach only requires weak decreasingness for the rules of $\mathcal{R}_1$.
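The self-embedding claim for the rewrite step above can be checked mechanically. The following is a minimal sketch, not part of the original development: it assumes terms are encoded as nested tuples, and the names `app`, `succ`, `div`, `minus` and `embedded` are hypothetical helpers introduced only for this illustration.

```python
# Terms: a variable is a str; an application is (symbol, (arg, ...)).
def app(f, *args):
    return (f, tuple(args))

def embedded(small, big):
    """True if `small` is homeomorphically embedded in `big`."""
    if small == big:
        return True
    if isinstance(big, str):
        return False
    f, big_args = big
    # Either `small` embeds into some argument of `big` ...
    if any(embedded(small, arg) for arg in big_args):
        return True
    # ... or both terms share the root symbol and the arguments embed pointwise.
    if not isinstance(small, str) and small[0] == f and len(small[1]) == len(big_args):
        return all(embedded(s_i, b_i) for s_i, b_i in zip(small[1], big_args))
    return False

def succ(t):
    return app("s", t)

# The step s(x) div s(s(x)) -> s((x - s(x)) div s(s(x))) discussed above.
lhs = app("div", succ("x"), succ(succ("x")))
rhs = succ(app("div", app("minus", "x", succ("x")), succ(succ("x"))))
is_self_embedding = embedded(lhs, rhs)
print(is_self_embedding)
```

Because the left-hand side is embedded in the right-hand side, no simplification order can orient this step, which is the obstacle to simple termination noted in the text.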

The next question we address is whether the argument filtering transforma- 
tion can be useful as a preprocessing step for the dependency pair technique. 
Surprisingly, the answer to this question is yes. Consider the TRS 

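$$\mathcal{R}_2 = \left\{ \begin{array}{lll}
\mathsf{f}(\mathsf{a}) \to \mathsf{f}(\mathsf{c}(\mathsf{a})) & \mathsf{f}(\mathsf{a}) \to \mathsf{f}(\mathsf{d}(\mathsf{a})) & \mathsf{e}(\mathsf{g}(x)) \to \mathsf{e}(x) \\
\mathsf{f}(\mathsf{c}(x)) \to x & \mathsf{f}(\mathsf{d}(x)) \to x & \\
\mathsf{f}(\mathsf{c}(\mathsf{a})) \to \mathsf{f}(\mathsf{d}(\mathsf{b})) & \mathsf{f}(\mathsf{c}(\mathsf{b})) \to \mathsf{f}(\mathsf{d}(\mathsf{a})) &
\end{array} \right.$$

This TRS is not DP simply terminating, which can be seen as follows. The
dependency pair $\mathsf{E}(\mathsf{g}(x)) \to \mathsf{E}(x)$ constitutes a cluster in the dependency graph
of $\mathcal{R}_2$. Hence, if $\mathcal{R}_2$ were DP simply terminating, there would be an argument
filtering $\pi$ and a simplification order $\succ$ such that (amongst others)

$$\begin{array}{ll}
\pi(\mathsf{f}(\mathsf{a})) \succeq \pi(\mathsf{f}(\mathsf{c}(\mathsf{a}))) & \pi(\mathsf{f}(\mathsf{a})) \succeq \pi(\mathsf{f}(\mathsf{d}(\mathsf{a}))) \\
\pi(\mathsf{f}(\mathsf{c}(x))) \succeq x & \pi(\mathsf{f}(\mathsf{d}(x))) \succeq x \\
\pi(\mathsf{f}(\mathsf{c}(\mathsf{a}))) \succeq \pi(\mathsf{f}(\mathsf{d}(\mathsf{b}))) & \pi(\mathsf{f}(\mathsf{c}(\mathsf{b}))) \succeq \pi(\mathsf{f}(\mathsf{d}(\mathsf{a})))
\end{array}$$

From $\pi(\mathsf{f}(\mathsf{c}(x))) \succeq x$ and $\pi(\mathsf{f}(\mathsf{d}(x))) \succeq x$ we infer that $\mathsf{f} \not\perp_\pi 1$, $\mathsf{c} \not\perp_\pi 1$, and
$\mathsf{d} \not\perp_\pi 1$. Hence $\pi(\mathsf{f}(\mathsf{a})) \succeq \pi(\mathsf{f}(\mathsf{c}(\mathsf{a})))$ and $\pi(\mathsf{f}(\mathsf{a})) \succeq \pi(\mathsf{f}(\mathsf{d}(\mathsf{a})))$ can only be satisfied
if $\pi(\mathsf{c}) = \pi(\mathsf{d}) = 1$. But then $\pi(\mathsf{f}(\mathsf{c}(\mathsf{a}))) \succeq \pi(\mathsf{f}(\mathsf{d}(\mathsf{b})))$ and $\pi(\mathsf{f}(\mathsf{c}(\mathsf{b}))) \succeq \pi(\mathsf{f}(\mathsf{d}(\mathsf{a})))$
amount to either $\mathsf{f}(\mathsf{a}) \succeq \mathsf{f}(\mathsf{b})$ and $\mathsf{f}(\mathsf{b}) \succeq \mathsf{f}(\mathsf{a})$ (if $\pi(\mathsf{f}) = [1]$) or $\mathsf{a} \succeq \mathsf{b}$ and $\mathsf{b} \succeq \mathsf{a}$
(if $\pi(\mathsf{f}) = 1$). Since $\mathsf{f}(\mathsf{a}) \neq \mathsf{f}(\mathsf{b})$ and $\mathsf{a} \neq \mathsf{b}$ the required simplification order does
not exist.

On the other hand, if $\pi(\mathsf{e}) = 1$ then $\mathrm{AFT}_\pi(\mathcal{R}_2)$ consists of the first six rewrite
rules of $\mathcal{R}_2$ together with $\mathsf{g}(x) \to x$. One easily verifies that there are no clusters
in $\mathrm{DG}(\mathrm{AFT}_\pi(\mathcal{R}_2))$ and hence $\mathrm{AFT}_\pi(\mathcal{R}_2)$ is trivially DP simply terminating.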
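The computation of the transformed system for the second example with the filtering that collapses e to its first argument can be replayed with a small program. This is only a sketch under stated assumptions: terms are nested tuples, symbols not listed in the filtering keep all their arguments, and all function names (`app`, `filt`, `bar`, `aft_term`, `aft_trs`, and so on) are hypothetical names chosen for this illustration, following Definition 5 and the definition of the AFT mapping as read from the text.

```python
# Terms: a variable is a str; an application is (symbol, (arg, ...)).
def app(f, *args):
    return (f, tuple(args))

def pi_of(pi, f, n):
    # Assumption: unlisted symbols keep all arguments, pi(f) = [1, ..., n].
    return pi.get(f, list(range(1, n + 1)))

def filt(pi, t):
    """Apply the argument filtering pi to the term t."""
    if isinstance(t, str):
        return t
    f, args = t
    p = pi_of(pi, f, len(args))
    if isinstance(p, int):                      # collapsing case pi(f) = i
        return filt(pi, args[p - 1])
    return (f, tuple(filt(pi, args[i - 1]) for i in p))

def bar(pi):
    """The filtering pi-bar: pi(f) = i becomes [i], lists stay unchanged."""
    return {f: ([p] if isinstance(p, int) else p) for f, p in pi.items()}

def has_defined(t, defined):
    if isinstance(t, str):
        return False
    return t[0] in defined or any(has_defined(a, defined) for a in t[1])

def outermost_non_preserved(pi, t):
    """Maximal subterms of t that the filtering pi throws away."""
    if isinstance(t, str):
        return []
    f, args = t
    p = pi_of(pi, f, len(args))
    kept = {p} if isinstance(p, int) else set(p)
    out = []
    for i, a in enumerate(args, start=1):
        if i in kept:
            out.extend(outermost_non_preserved(pi, a))
        else:
            out.append(a)
    return out

def aft_term(pi, defined, t):
    result = set()
    if has_defined(filt(bar(pi), t), defined):
        result.add(filt(pi, t))
    for s in outermost_non_preserved(pi, t):
        result |= aft_term(pi, defined, s)
    return result

def aft_trs(pi, defined, rules):
    return {(filt(pi, l), r2)
            for l, r in rules
            for r2 in aft_term(pi, defined, r) | {filt(pi, r)}}

a, b = app("a"), app("b")
R2 = [
    (app("f", a), app("f", app("c", a))),
    (app("f", a), app("f", app("d", a))),
    (app("f", app("c", "x")), "x"),
    (app("f", app("d", "x")), "x"),
    (app("f", app("c", a)), app("f", app("d", b))),
    (app("f", app("c", b)), app("f", app("d", a))),
    (app("e", app("g", "x")), app("e", "x")),
]
transformed = aft_trs({"e": 1}, {"f", "e"}, R2)
```

Under this encoding, `transformed` contains the six rules for f unchanged plus the rule from g(x) to x, in line with the description in the text.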

Definition 6. An argument filtering $\pi$ is called collapsing if $\pi(f) = i$ for some
defined function symbol $f$.

The argument filtering in the previous example is collapsing. In the remainder
of this section we show that for non-collapsing argument filterings the implication
“$\mathrm{AFT}_\pi(\mathcal{R})$ is DP simply terminating $\Rightarrow$ $\mathcal{R}$ is DP simply terminating” is valid.
Thus, using the argument filtering transformation with a non-collapsing $\pi$ as a
preprocessing step to the dependency pair technique has no advantages.

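First we prove a lemma to relate the dependency pairs of $\mathcal{R}$ and $\mathrm{AFT}_\pi(\mathcal{R})$.

Lemma 7. Let $\pi$ be a non-collapsing argument filtering. If $l^\sharp \to t^\sharp \in \mathrm{DP}(\mathcal{R})$
then $\pi(l)^\sharp \to \pi(t)^\sharp \in \mathrm{DP}(\mathrm{AFT}_\pi(\mathcal{R}))$.

Proof. By definition there is a rewrite rule $l \to r \in \mathcal{R}$ and a subterm $t \trianglelefteq r$ with
defined root symbol. According to Lemma 6 there exists a term $u \in \mathrm{AFT}_\pi(r)$
such that $\pi(t) \trianglelefteq u$. Thus, $\pi(l) \to u \in \mathrm{AFT}_\pi(\mathcal{R})$. Since $\pi$ is non-collapsing,
$\mathrm{root}(\pi(t)) = \mathrm{root}(t)$. Hence, as $\mathrm{root}(t)$ is defined, $\pi(l)^\sharp \to \pi(t)^\sharp$ is a dependency
pair of $\mathrm{AFT}_\pi(\mathcal{R})$. □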

Example $\mathcal{R}_2$ shows that the above lemma is not true for arbitrary argument
filterings. The reason is that $\mathsf{e}(\mathsf{g}(x))^\sharp \to \mathsf{e}(x)^\sharp$ is a dependency pair of $\mathcal{R}_2$, but
with $\pi(\mathsf{e}) = 1$ there is no corresponding dependency pair in $\mathrm{AFT}_\pi(\mathcal{R}_2)$.

The next three lemmata will be used to show that clusters in $\mathrm{DG}(\mathcal{R})$ correspond
to clusters in $\mathrm{DG}(\mathrm{AFT}_\pi(\mathcal{R}))$.

Definition 7. Given an argument filtering $\pi$ and a substitution $\sigma$, the substitution
$\sigma_\pi$ is defined as $\pi \circ \sigma$ (i.e., $\sigma$ is applied first).

Lemma 8. For all terms $t$, argument filterings $\pi$, and substitutions $\sigma$, $\pi(t\sigma) =
\pi(t)\sigma_\pi$.

Proof. Easy induction on the structure of $t$. □

Lemma 9. Let $\mathcal{R}$ be a TRS and $\pi$ a non-collapsing argument filtering. If $s \to^*_{\mathcal{R}} t$
then $\pi(s) \to^*_{\mathrm{AFT}_\pi(\mathcal{R})} \pi(t)$.

Proof. It suffices to show that $\pi(s) \to^*_{\mathrm{AFT}_\pi(\mathcal{R})} \pi(t)$ whenever $s \to_{\mathcal{R}} t$ consists of
a single rewrite step. Let $s = C[l\sigma]$ and $t = C[r\sigma]$ for some context $C$, rewrite
rule $l \to r \in \mathcal{R}$, and substitution $\sigma$. We use induction on $C$. If $C$ is the empty
context, then $\pi(s) = \pi(l\sigma) = \pi(l)\sigma_\pi$ and $\pi(t) = \pi(r\sigma) = \pi(r)\sigma_\pi$ according to
Lemma 8. As $\pi(l) \to \pi(r) \in \mathrm{AFT}_\pi(\mathcal{R})$, we have $\pi(s) \to_{\mathrm{AFT}_\pi(\mathcal{R})} \pi(t)$. Suppose
$C = f(s_1, \ldots, C', \ldots, s_n)$ where $C'$ is the $i$-th argument of $C$. If $f \perp_\pi i$ then
$\pi(s) = \pi(t)$. If $\pi(f) = i$ (which is possible for constructors $f$) then $\pi(s) =
\pi(C'[l\sigma])$ and $\pi(t) = \pi(C'[r\sigma])$, and thus we obtain $\pi(s) \to^*_{\mathrm{AFT}_\pi(\mathcal{R})} \pi(t)$ from the
induction hypothesis. In the remaining case we have $\pi(f) = [i_1, \ldots, i_m]$ with $i_j =
i$ for some $j$ and hence $\pi(s) = f(\pi(s_{i_1}), \ldots, \pi(C'[l\sigma]), \ldots, \pi(s_{i_m}))$ and $\pi(t) =
f(\pi(s_{i_1}), \ldots, \pi(C'[r\sigma]), \ldots, \pi(s_{i_m}))$. In this case we obtain $\pi(s) \to^*_{\mathrm{AFT}_\pi(\mathcal{R})} \pi(t)$
from the induction hypothesis as well. □

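The following lemma states that if two dependency pairs are connected in $\mathcal{R}$'s
dependency graph, then the corresponding pairs are connected in the dependency
graph of $\mathrm{AFT}_\pi(\mathcal{R})$ as well.

Lemma 10. Let $\mathcal{R}$ be a TRS, $\pi$ a non-collapsing argument filtering, and $s$, $t$
be terms with defined root symbols. If $s^\sharp\sigma \to^*_{\mathcal{R}} t^\sharp\sigma$ for some substitution $\sigma$ then
$\pi(s)^\sharp\sigma_\pi \to^*_{\mathrm{AFT}_\pi(\mathcal{R})} \pi(t)^\sharp\sigma_\pi$.

Proof. We have $s = f(s_1, \ldots, s_n)$ and $t = f(t_1, \ldots, t_n)$ for some $n$-ary defined
function symbol $f$ with $s_i\sigma \to^*_{\mathcal{R}} t_i\sigma$ for all $1 \leq i \leq n$. Let $\pi(f) = [i_1, \ldots, i_m]$.
This implies $\pi(s\sigma)^\sharp = f^\sharp(\pi(s_{i_1}\sigma), \ldots, \pi(s_{i_m}\sigma))$ and $\pi(t\sigma)^\sharp = f^\sharp(\pi(t_{i_1}\sigma), \ldots,
\pi(t_{i_m}\sigma))$. From the preceding lemma we know that $\pi(s_{i_j}\sigma) \to^*_{\mathrm{AFT}_\pi(\mathcal{R})} \pi(t_{i_j}\sigma)$
for all $1 \leq j \leq m$. Hence, using Lemma 8, $\pi(s)^\sharp\sigma_\pi = \pi(s\sigma)^\sharp \to^*_{\mathrm{AFT}_\pi(\mathcal{R})} \pi(t\sigma)^\sharp =
\pi(t)^\sharp\sigma_\pi$. □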

Now we can finally prove the main theorem of this section. 

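Theorem 8. Let $\mathcal{R}$ be a TRS and $\pi$ a non-collapsing argument filtering. If
$\mathrm{AFT}_\pi(\mathcal{R})$ is DP simply terminating then $\mathcal{R}$ is DP simply terminating.

Proof. Let $\mathcal{C}$ be a cluster in $\mathrm{DG}(\mathcal{R})$. According to Lemmata 7 and 10, there is a
corresponding cluster in $\mathrm{DG}(\mathrm{AFT}_\pi(\mathcal{R}))$, which we denote by $\pi(\mathcal{C})$. By assumption,
there exist an argument filtering $\pi'$ and a simplification order $\succ$ such that
$\pi'(\mathrm{AFT}_\pi(\mathcal{R}) \cup \pi(\mathcal{C})) \subseteq {\succeq}$ and $\pi'(\pi(\mathcal{C})) \cap {\succ} \neq \emptyset$. We define an argument filtering
$\pi''$ for $\mathcal{R}$ as the composition of $\pi$ and $\pi'$. For a precise definition, let $\flat$ denote
the unmarking operation, i.e., $f^\flat = f$ and $(f^\sharp)^\flat = f$ for all $f \in \mathcal{F}$. Then for all
$f \in \mathcal{F}^\sharp$ we define

$$\pi''(f) = \begin{cases}
[i_{j_1}, \ldots, i_{j_k}] & \text{if } \pi(f^\flat) = [i_1, \ldots, i_m] \text{ and } \pi'(f) = [j_1, \ldots, j_k], \\
i_j & \text{if } \pi(f^\flat) = [i_1, \ldots, i_m] \text{ and } \pi'(f) = j, \\
i & \text{if } \pi(f^\flat) = i.
\end{cases}$$

It is not difficult to show that $\pi''(t) = \pi'(\pi(t))$ and $\pi''(t^\sharp) = \pi'(\pi(t)^\sharp)$ for all
terms $t$ without marked symbols. We claim that $\pi''$ and $\succ$ satisfy the constraints
for $\mathcal{C}$, i.e., $\pi''(\mathcal{R} \cup \mathcal{C}) \subseteq {\succeq}$ and $\pi''(\mathcal{C}) \cap {\succ} \neq \emptyset$. These two properties follow from the
two assumptions $\pi'(\mathrm{AFT}_\pi(\mathcal{R}) \cup \pi(\mathcal{C})) \subseteq {\succeq}$ and $\pi'(\pi(\mathcal{C})) \cap {\succ} \neq \emptyset$ in conjunction
with the obvious inclusion $\pi(\mathcal{R}) \subseteq \mathrm{AFT}_\pi(\mathcal{R})$. □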

Theorem 8 also holds for the estimated dependency graph instead of the real
dependency graph.

5 Conclusion

In this paper, we have compared two transformational techniques for termination 
proofs, viz. dummy elimination [11] and the argument filtering transformation 
[16], with the dependency pair technique of Arts and Giesl [1,2,3]. Essentially, all 
these techniques transform a given TRS into new inequalities or rewrite systems 
which then have to be oriented by suitable well-founded orders. Virtually all well- 
founded orders which can be generated automatically are simplification orders. 
As our focus was on automated termination proofs, we therefore investigated the 
strengths of these three techniques when combined with simplification orders. 

To that end, we showed that whenever an automated termination proof is 
possible using dummy elimination or the argument filtering transformation, then 
a corresponding termination proof can also be obtained by dependency pairs. 
Thus, the dependency pair technique is more powerful than dummy elimination 
or the argument filtering transformation on their own. 

Moreover, we examined whether dummy elimination or the argument fil- 
tering transformation would at least be helpful as a preprocessing step to the 
dependency pair technique. We proved that for dummy elimination and for an 
argument filtering transformation with a non-collapsing argument filtering, this 
is not the case. In fact, whenever there is a (pre)order satisfying the dependency 
pair constraints for the rewrite system resulting from dummy elimination or a 
non-collapsing argument filtering transformation, then the same (pre)order also 
satisfies the dependency pair constraints for the original TRS. 

As can be seen from the proofs of our main theorems, this latter result
even holds for arbitrary (i.e., non-simplification) (pre)orders. Thus, in particular,
Theorems 5 and 8 also hold for DP quasi-simple termination [13]. This notion
captures those TRSs where the dependency pair constraints are satisfied by
an arbitrary simplification preorder $\succsim$ (instead of just a preorder $\succeq$ where the
equivalence relation is syntactic equality as in DP simple termination).

Future work will include a further investigation on the usefulness of collapsing
argument filtering transformations as a preprocessing step to dependency
pairs. Note that our counterexample $\mathcal{R}_2$ is DP quasi-simply terminating (but not
DP simply terminating). In other words, at present it is not clear whether the
argument filtering transformation is useful as a preprocessing step to the dependency
pair technique if one admits arbitrary simplification preorders to solve the
generated constraints. However, an extension of Theorem 8 to DP quasi-simple
termination and to collapsing argument filterings $\pi$ is not straightforward, since
clusters of dependency pairs in $\mathcal{R}$ may disappear in $\mathrm{AFT}_\pi(\mathcal{R})$ (i.e., Lemma 7
does not hold for collapsing argument filterings). We also intend to examine the
relationship between dependency pairs and other transformation techniques such
as “freezing” [20].
Abstract. Families of function definitions and conjectures based in
quantifier-free decidable theories are identified for which inductive validity
of conjectures can be decided by the cover set method, a heuristic
implemented in a rewrite-based induction theorem prover Rewrite Rule
Laboratory (RRL) for mechanizing induction. Conditions characterizing
definitions and conjectures are syntactic, and can be easily checked, thus
making it possible to determine a priori whether a given conjecture can
be decided. The concept of a T-based function definition is introduced
that consists of a finite set of terminating complete rewrite rules of the
form $f(s_1, \ldots, s_m) \to r$, where $s_1, \ldots, s_m$ are interpreted terms from
a decidable theory T, and $r$ is either an interpreted term or has non-nested
recursive calls to $f$ with all other function symbols from T. Two
kinds of conjectures are considered. Simple conjectures are of the form
$f(x_1, \ldots, x_m) = t$, where $f$ is T-based, the $x_i$'s are distinct variables, and $t$ is
interpreted in T. Complex conjectures differ from simple conjectures in
their left sides, which may contain many function symbols whose definitions
are T-based, and the nested order in which these function symbols
appear in the left sides has the compatibility property with their definitions.

The main objective is to ensure that for each induction subgoal gener- 
ated from a conjecture after selecting an induction scheme, the resulting 
formula can be simplified so that induction hypothesis(es), whenever 
needed, is applicable, and the result of this application is a formula in 
T. Decidable theories considered are the quantifier-free theory of Pres- 
burger arithmetic, congruence closure on ground terms (with or with- 
out associative-commutative operators), propositional calculus, and the 
quantifier-free theory of constructors (mostly, free constructors as in the 
case of finite lists and finite sequences). A byproduct of the approach is 
that it can predict the structure of intermediate lemmas needed for au- 
tomatically deciding this subclass of conjectures. Several examples over 
lists, numbers and of properties involved in establishing the number- 
theoretic correctness of arithmetic circuits are given. 

* Partially supported by the National Science Foundation Grant nos. CCR-9712396, 
CCR-9712366, CCR-9996150, and CDA-9503064. 



D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 324-345, 2000. 
© Springer-Verlag Berlin Heidelberg 2000




Extending Decision Procedures with Induction Schemes 






1 Introduction 

Inductive reasoning is ubiquitous in verifying properties of computations realized 
in hardware and software. Automation of inductive reasoning is hampered by 
the fact that proofs by induction need an appropriate selection of variables for 
performing induction and a suitable induction scheme, as well as intermediate 
lemmas. It is well-known that inductive reasoning often needs considerable user 
guidance, because of which its automation has been a major challenge. A lot of 
effort has been spent on mechanizing induction in theorem provers (e.g., Nqthm, 
ACL2, RRL, INKA, Oyster-Clam), and induction heuristics in these provers
have been successfully used to establish several nontrivial properties. However, 
the use of induction as a mechanized rule of inference is seriously undermined 
due to the lack of automation in using this rule. Many reasoning tools including 
model checkers (in conjunction with decision procedures), invariant generators, 
deductive synthesis tools preclude induction for lack of automation. This severely 
limits the reasoning capability of these tools. In many cases inductive properties 
are established outside these tools manually. 

For hardware circuit descriptions, the need for inductive reasoning arises when
reasoning is attempted about a circuit description parameterized by data width 
and/or generic components. In protocol verification, induction is often needed 
when a protocol has to be analyzed for a large set of processors (or network 
channels). Inductive reasoning in many such cases is not as challenging as in 
software specifications as well as recursive and loop programs. 

This paper is an attempt to address this limitation of these automated tools 
while preserving their automation, and without having the full generality of a 
theorem prover. It is shown how decision procedures for simple theories about 
certain data structures, e.g., numbers, booleans, finite lists, finite sequences, can 
be enhanced to include induction techniques with the objective that proofs em- 
ploying such techniques can be done automatically. The result is an extended 
decision procedure with a built-in induction scheme, implying that an induc- 
tive theorem prover can be run in push-button mode as well. We believe the 
proposed approach can substantially enhance the reasoning power of tools built 
using decision procedures and model-checkers without losing the advantage of 
automation. 

This cannot be done, however, in general. Conditions are identified on func- 
tion definitions and conjectures which guarantee such automation. It becomes 
possible to determine a priori whether a given conjecture can be decided au- 
tomatically, thus predicting the success or failure of using a theorem proving 
strategy. That is the main contribution of the paper. A byproduct of the pro- 
posed approach is that in case of a failure of the theorem proving strategy, it can 
predict for a subclass of conjectures, the structure of lemmas needed for proof 
attempts to succeed. 

The proposed approach is based on two main ideas: First, terminating re- 
cursive definitions of function symbols as rewrite rules oriented using a well- 
founded ordering, can be used to generate induction schemes providing useful 
induction hypotheses for proofs by induction; this idea is the basis for the cover
set method proposed in [16] and implemented in a rewrite-rule based induction
theorem prover Rewrite Rule Laboratory (RRL) [14]. Second, for inductive proofs 
of conjectures satisfying certain conditions, induction schemes generated from 
T -based recursive function definitions (a concept characterized precisely below), 
lead to subgoals in T (after the application of induction hypotheses), where T is 
a decidable theory. These conditions are based on the structure of the function 
definitions and the conjecture. 

The concept of a T-based function definition is introduced to achieve the
above objective. It is shown that conjectures of the form $f(x_1, \ldots, x_m) = r$,
where $f$ has a T-based function definition, the $x_i$'s are distinct variables, and $r$ is
an interpreted term in T (i.e., $r$ includes only function symbols from T), can
be decided using the cover set method^. The reason for focusing on such simple
conjectures is that there is only one induction scheme to be considered by the 
cover set method. It might be possible to relax this restriction and consider more 
complicated conjectures insofar as they suggest one induction scheme and the 
induction hypothesis(es) is applicable to the subgoals after using the function 
definitions for simplification. 

Decidable theories considered are the quantifier-free theory of Presburger 
arithmetic, congruence closure on ground terms (with or without associative- 
commutative operators), propositional calculus as well as the quantifier-free 
theory of constructors (mostly free constructors as in the case of finite lists 
and finite sequences). For each such theory, decision procedures exist, and RRL, 
for instance, has an implementation of them integrated with rewriting [7,8]. 

Below, we review two examples providing an overview of the proposed ap- 
proach, the subclass of conjectures and definitions which can be considered. 

1.1 A Simple Conjecture 

Consider the following very simple but illustrative example. 

(C1): double(m) = m + m,

where double is recursively defined using the rewrite rules:

1. double(0) -> 0,

2. double(s(x)) -> s(s(double(x))).

A proof by induction with m as the induction variable and using the standard
induction scheme (i.e., Peano’s principle of mathematical induction over numbers)
leads to one basis goal and one induction step goal. For the basis subgoal,
the substitution m <- 0 gives double(0) = 0 + 0, which simplifies using the
definition of double to a valid formula in Presburger arithmetic.

In the step subgoal, the conclusion generated using the substitution m <- s(x)
is double(s(x)) = s(x) + s(x), with the induction hypothesis, obtained by the
substitution m <- x, being double(x) = x + x.

^ As will be shown later, it is not necessary to require that each argument to / be 
a distinct variable; instead, non-induction arguments can be interpreted terms that 
do not include variables in inductive positions of /. 
By the second rule in the definition of double, the formula simplifies and the
induction hypothesis applies, resulting again in a valid formula s(s(x + x)) =
s(x) + s(x) in Presburger arithmetic. Hence, (C1) is valid using Presburger
arithmetic and the induction scheme of double.

Similarly, a conjecture double (m) = m can be decided to be false: the basis 
case will go through, but the formula resulting from the induction step and the 
application of the induction hypothesis is not valid. 
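The two induction attempts on double can be replayed with a very small check. The sketch below is an illustration only, not the decision procedure used in RRL: it assumes that every subgoal has already been simplified by the definition of double and the induction hypothesis, and it decides the remaining equations by normalising each side to a linear polynomial in the single variable x. The names `norm`, `valid`, `s` and `plus` are hypothetical.

```python
# A term over 0, s, + and the variable x is normalised to a pair (c, k)
# standing for the linear polynomial c*x + k.
def norm(t):
    if t == "0":
        return (0, 0)
    if t == "x":
        return (1, 0)
    op, *args = t
    if op == "s":
        c, k = norm(args[0])
        return (c, k + 1)
    if op == "+":
        (c1, k1), (c2, k2) = norm(args[0]), norm(args[1])
        return (c1 + c2, k1 + k2)
    raise ValueError(f"unexpected symbol {op!r}")

def valid(lhs, rhs):
    # Two such polynomials agree on all natural numbers iff they are identical.
    return norm(lhs) == norm(rhs)

def s(t):
    return ("s", t)

def plus(a, b):
    return ("+", a, b)

x = "x"
# double(m) = m + m: basis 0 = 0 + 0, step s(s(x + x)) = s(x) + s(x).
c1_basis = valid("0", plus("0", "0"))
c1_step = valid(s(s(plus(x, x))), plus(s(x), s(x)))
# double(m) = m: basis 0 = 0, step s(s(x)) = s(x) after using double(x) = x.
bad_basis = valid("0", "0")
bad_step = valid(s(s(x)), s(x))

# Direct evaluation of the two rewrite rules, for a sanity check on numerals.
def double(n):
    return 0 if n == 0 else 2 + double(n - 1)
```

Both subgoals of the first conjecture come out valid, while for double(m) = m the basis case succeeds and the step case fails, matching the discussion above.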

The main features of the above conjectures and the definition of double are: 

1. unambiguous induction scheme using which the formula can be decided, 

2. induction hypotheses are strong enough to be applicable to induction sub- 
goals, and finally 

3. the formulas resulting from subgoals after applying the definition and the 
induction hypotheses are decidable. 

Properties of T-based definitions and simple conjectures ensure the above. 



1.2 A Complex Conjecture 

For considering complex conjectures including many function symbols with T- 
based definitions, it becomes necessary to consider the interaction among their 
definitions based on their nesting order in conjectures. This aspect is captured by 
the compatibility property of function definitions (which is precisely characterized 
in a later section). The key insight is similar to the one observed of a simple 
conjecture. Compatible function definitions can be viewed as composing into a 
single T-based function definition so that a complex conjecture can be viewed 
as being a simple conjecture in terms of the composed function as illustrated 
below. 



(C2): log(exp2(m)) = m.

The definitions of the functions log, exp2 (logarithm and exponentiation to the 
base 2 respectively) are as follows. Following the mathematical convention, log 
is defined on positive numbers only^. 



1. log(s(0))      -> 0,
2. log(x + x)     -> s(log(x)),
3. log(s(x + x))  -> s(log(x)).

4. exp2(0)        -> s(0),
5. exp2(s(x))     -> exp2(x) + exp2(x).



Unlike a simple conjecture, the left side of (C2) is a nested term. Again, there
is only one induction variable m, and the induction scheme used for attempting
a proof by induction is the principle of mathematical induction for numbers
(suggested by the cover set of exp2, as explained later in section 2.1).

^ This implies that the induction schemes generated using the definition of log can be
used to decide the validity of conjectures over positive numbers only. For a detailed
discussion of the use of cover sets and induction schemes derived from definitions
such as log, please refer to [9].

There is one basis goal and one induction step subgoal. The basis subgoal is
log(exp2(0)) = 0. The left side rewrites using the definitions of exp2 and then
log, resulting in a valid formula in Presburger arithmetic.

In the induction step, the conclusion is log(exp2(s(x))) = s(x) with the
hypothesis being log(exp2(x)) = x. By the definition of exp2, exp2(s(x)) simplifies
to exp2(x) + exp2(x). This subgoal will simplify to a formula in Presburger
arithmetic if log(exp2(x) + exp2(x)) rewrites to s(log(exp2(x))) either
as a part of the definition of log or as an intermediate lemma, and then
the induction hypothesis can apply. Such interaction between the definitions of
log and exp2 is captured by compatibility. Since the definition of log includes
such a rule, the validity of the induction step case and hence the validity of (C2)
can be decided.

The validity of a closely related conjecture, 

(C2’): exp2(log(m)) = m, 

can be similarly decided since exp2 is compatible with log. An induction proof 
can be attempted using m as the induction variable as before. However, m can 
take only positive values since the function log is defined only for these. The 
induction scheme used is different from the principle of mathematical induction. 
Instead, it is based on the definition of log. There is a basis case corresponding
to the number s(0), and two step cases corresponding to m being a positive even
or a positive odd number respectively (this scheme is derived from the cover set
of log as explained later in section 2.1).

In one of the induction step subgoals, the left side of the conclusion,
exp2(log(s(x + x))) = s(x + x), rewrites by the definition of log to
exp2(s(log(x))) which then rewrites to exp2(log(x)) + exp2(log(x)), to
which the hypothesis exp2(log(x)) = x applies to produce the inconsistent
Presburger arithmetic formula x + x = s(x + x). Hence (C2’) is decided to be
not valid.
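The behaviour of the two conjectures about log and exp2 can also be observed by direct evaluation on small numbers. This is a sketch for intuition only and not an inductive proof: it assumes the five rewrite rules are read as the recursive Python functions below, with the rule for exp2(0) taken to produce s(0), and it only tests finitely many values.

```python
def exp2(n):
    # exp2(0) -> s(0) and exp2(s(x)) -> exp2(x) + exp2(x)
    return 1 if n == 0 else exp2(n - 1) + exp2(n - 1)

def log(n):
    # log(s(0)) -> 0, log(x + x) -> s(log(x)), log(s(x + x)) -> s(log(x));
    # the function is defined on positive numbers only.
    if n <= 0:
        raise ValueError("log is defined on positive numbers only")
    if n == 1:
        return 0
    return 1 + log(n // 2)  # n is x + x or s(x + x) with x = n // 2

# log(exp2(m)) = m holds on every tested value.
c2_holds = all(log(exp2(m)) == m for m in range(10))

# exp2(log(m)) = m fails exactly when m is not a power of two.
counterexamples = [m for m in range(1, 10) if exp2(log(m)) != m]
```

The odd step case discussed above corresponds to counterexamples such as m = 3, where exp2(log(3)) evaluates to 2.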

As stated above, if log and exp2 are combined to develop the definition 
of the composed function log(exp2(x)) from their definitions, then (C2) is a 
simple conjecture about the composed function. Further, the definition of the 
composed function can be proved to be T-based as well. So the decidability of 
the conjecture follows. 

The notion of compatibility among the definitions of function symbols can be
generalized to a compatible sequence of function symbols $f_1, \ldots, f_d$ where each
$f_i$ is compatible with $f_{i+1}$ at the $j_i$-th argument, $1 \leq i \leq d - 1$. A conjecture $l = r$
can then be decided if the sequence of function symbols from the root of $l$ to the
innermost function symbol forms a compatible sequence, and $r$ is an interpreted
term in T.

The proposed approach is discussed below in the framework of our theorem 
prover Rewrite Rule Laboratory (RRL), but the results should apply to other 
induction provers that rely on decision procedures and support heuristics for 
selecting induction schemes, e.g., Boyer and Moore’s theorem prover Nqthm,
ACL2, and INKA. Moreover, the proposed approach can be integrated into tools
based on decision procedures and model checking.

The main motivation for this work comes from our work on verifying proper- 
ties of generic, parameterized arithmetic circuits, including adders, multipliers, 
dividers and square root [12,10,11,13]. The approach is illustrated on several 
examples including properties arising in proofs of arithmetic circuits, as well as 
commonly used properties of numbers and lists involving defined function symbols.
A byproduct of this approach is that if a conjecture with the above-mentioned
restriction cannot be decided, the structure of intermediate lemmas needed
for deciding it can be predicted. This can aid in automatic lemma speculation.

1.3 Related Work 

Boyer and Moore while describing the integration of linear arithmetic into Nqthm 
[3] discussed the importance of reasoning about formulas involving defined func- 
tion symbols and interpreted terms. Many examples of such conjectures were 
discussed there. They illustrated how these examples can be done using the in- 
teraction of the theorem prover and the decision procedure. In this paper we 
have focussed on automatically deciding the validity of such conjectures. Most 
of the examples described there can be automatically decided using the proposed 
approach. 

Fribourg [4] showed that properties of certain recursive predicates over lists 
expressed as logic programs along with numbers, can be decided. Most of the 
properties established there can be formulated as equational definitions and de- 
cided using the proposed approach. The procedure in [4] used bottom-up eval- 
uation of logic programs which need not terminate if successor operation over 
numbers is included. The proposed approach does not appear to have this limi- 
tation. 

Gupta’s dissertation [1] was an attempt to integrate (a limited form of) 
inductive reasoning with a decision procedure for propositional formulas, e.g., 
ordered BDDs. She showed how properties about a certain subclass of circuits 
of arbitrary data width can be verified automatically. Properties automatically 
verified using her approach constitute a very limited subset, however. 

2 Cover Set Induction 

The cover set method is used to mechanize well-founded induction in RRL, 
and has been used to successfully perform proofs by induction in a variety of 
nontrivial application domains [12,10,11]. For attempting a proof by induction of
a conjecture containing a subterm $t = f(x_1, \ldots, x_m)$, where each $x_i$ is a distinct
variable, an induction scheme from a complete definition of $f$ given as a set of
terminating rewrite rules is generated as follows. There is one induction subgoal
corresponding to each terminating rule in the definition of $f$. The induction
conclusion is generated using the substitution from the left side of a rule, and
an induction hypothesis is generated using the substitution from each recursive
function call in the right side of the rule. Rules without recursive calls in their
right sides lead to subgoals without any induction hypotheses (basis steps).
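The generation of an induction scheme from a definition, as just described, can be sketched in a few lines. This is a simplified illustration and not RRL's implementation: terms are nested tuples, a substitution is a dictionary from the variables of the basic term to terms, and the names `app`, `calls` and `induction_scheme` are hypothetical.

```python
# Terms: a variable is a str; an application is (symbol, (arg, ...)).
def app(f, *args):
    return (f, tuple(args))

def calls(f, t):
    """All subterms of t whose root symbol is f (the recursive calls)."""
    if isinstance(t, str):
        return []
    found = [t] if t[0] == f else []
    for a in t[1]:
        found.extend(calls(f, a))
    return found

def induction_scheme(f, params, rules):
    """One subgoal per rule f(s1, ..., sm) -> r.

    The conclusion substitution maps the variables of the basic term
    f(x1, ..., xm) to the arguments of the left side; each recursive call
    in the right side contributes one hypothesis substitution.
    """
    scheme = []
    for lhs, rhs in rules:
        conclusion = dict(zip(params, lhs[1]))
        hypotheses = [dict(zip(params, c[1])) for c in calls(f, rhs)]
        scheme.append((conclusion, hypotheses))
    return scheme

zero = app("0")
double_rules = [
    (app("double", zero), zero),
    (app("double", app("s", "x")), app("s", app("s", app("double", "x")))),
]
scheme = induction_scheme("double", ["m"], double_rules)
```

For the definition of double this yields a basis subgoal with m <- 0 and no hypothesis, and a step subgoal with conclusion m <- s(x) and hypothesis m <- x.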

The recursive definitions of function symbols appearing in a conjecture can 
thus be used to come up with an induction scheme. Heuristics have been de- 
veloped and implemented in RRL, which in conjunction with failure analysis 
of induction schemes and backtracking in case of failure, have been found ap- 
propriate for prioritizing induction schemes, automatically selecting the “most 
appropriate” induction scheme (thus selecting induction variables), and gener- 
ating the proofs of many conjectures. 

2.1 Definitions and Notation 

Let $T(F, X)$ denote a set of terms where $F$ is a finite set of function symbols
and $X$ is a set of variables. A term is either a variable $x \in X$, or a function
symbol $f \in F$ followed by a finite sequence of terms, called arguments of $f$. Let
$Vars(t)$ denote the variables appearing in a term $t$. The subterms of a term are
the term itself and the subterms of its arguments. A position is a finite sequence
of positive integers separated by "."s, which is used to identify a subterm in a
term. The subterm of $t$ at the position denoted by the empty sequence $\epsilon$ is $t$
itself. If $f(t_1, \ldots, t_m)$ is a subterm at a position $p$, then $t_j$ is the subterm at
the position $p.j$. Let $depth(t)$ denote the depth of $t$; $depth(t)$ is 0 if $t$ is a
variable or a constant (denoted by a function symbol with arity 0), and
$depth(f(t_1, \ldots, t_m)) = \max\{depth(t_i) \mid 1 \le i \le m\} + 1$.

A term $f(t_1, \ldots, t_m)$ is called basic if each $t_i$ is a distinct variable.

A substitution $\theta$ is a mapping from a finite set of variables to terms, denoted
as $\{x_1 \leftarrow t_1, \ldots, x_m \leftarrow t_m\}$, $m \ge 0$, where the $x_i$'s are distinct. $\theta$ applied to
$s = f(s_1, \ldots, s_m)$ is $f(\theta(s_1), \ldots, \theta(s_m))$. Term $s$ matches $t$ under $\theta$ if
$\theta(s) = t$. Terms $s$ and $t$ unify under $\theta$ if $\theta(s) = \theta(t)$.

A rewrite rule $s \to t$ is an ordered pair of terms $(s, t)$ with $Vars(t) \subseteq Vars(s)$.
A rule $s \to t$ is applicable to a term $u$ iff for some substitution $\theta$ and position
$p$ in $u$, $\theta(s) = u|_p$. The application of the rule rewrites $u$ to $u[p \leftarrow \theta(t)]$, the
term obtained after replacing the subterm at position $p$ in $u$ by $\theta(t)$. A rewrite
system $R$ is a finite set of rewrite rules. $R$ induces a relation among terms
denoted $\to_R$; $s \to_R t$ denotes rewriting of $s$ to $t$ by a single application of a rule
in $R$. $\to_R^+$ and $\to_R^*$ denote the transitive and the reflexive, transitive closure
of $\to_R$.
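
The notions of substitution, matching and rewriting defined above can be made concrete with a small sketch. It is only an illustration under stated assumptions: variables are represented as Python strings, an application $f(t_1, \ldots, t_m)$ as the tuple `("f", t1, ..., tm)`, and the helper names and the root-first rewriting strategy are choices made for this sketch, not a description of RRL. The two rules are the definition of append used later in Section 3.1.

```python
def is_var(t):
    return isinstance(t, str)

def subst(theta, t):
    """Apply the substitution theta (dict: variable -> term) to the term t."""
    if is_var(t):
        return theta.get(t, t)
    return (t[0],) + tuple(subst(theta, a) for a in t[1:])

def match(pat, t, theta):
    """Extend theta so that theta(pat) == t; return None if impossible."""
    if is_var(pat):
        return theta if theta.setdefault(pat, t) == t else None
    if is_var(t) or pat[0] != t[0] or len(pat) != len(t):
        return None
    for p, a in zip(pat[1:], t[1:]):
        if match(p, a, theta) is None:
            return None
    return theta

def rewrite_once(rules, t):
    """One rewrite step, trying the root first and then the arguments."""
    for lhs, rhs in rules:
        theta = match(lhs, t, {})
        if theta is not None:
            return subst(theta, rhs)
    if not is_var(t):
        for i in range(1, len(t)):
            new = rewrite_once(rules, t[i])
            if new is not None:
                return t[:i] + (new,) + t[i + 1:]
    return None

def normalize(rules, t):
    """Rewrite t until no rule applies (the rules are assumed terminating)."""
    while True:
        new = rewrite_once(rules, t)
        if new is None:
            return t
        t = new

NIL = ("nil",)
APPEND = [
    (("append", NIL, "x"), "x"),
    (("append", ("cons", "x", "y"), "z"), ("cons", "x", ("append", "y", "z"))),
]

term = ("append", ("cons", "a", NIL), ("cons", "b", NIL))
normal_form = normalize(APPEND, term)  # cons(a, cons(b, nil))
```

Pattern variables of a rule and variables of the term being rewritten live in separate name spaces here, because `match` only binds the variables of the pattern and `subst` makes a single pass over the right side.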

The set F is partitioned into defined and interpreted function symbols. An 
interpreted function symbol comes from a decidable theory T. A defined function 
symbol is defined by a finite set of terminating rewrite rules, and its definition 
is assumed to be complete. Term t is interpreted if all the function symbols in it 
are from T. Underlying decidable theories are quantifier-free theories. 

An equation s = t is inductively valid (valid, henceforth) iff for each variable
in it, whenever any ground term of the appropriate type is substituted into 
s = t, the instantiated equation is in the equational theory (modulo the decidable 
theory T) of the definitions of function symbols in s,t. This is equivalent to s = t 
holding in the initial model of the equations corresponding to the definitions and 
the decidable theory T.

Given a complete function definition as a finite set of terminating rewrite
rules $\{l_i \to r_i \mid l_i = f(\ldots),\ 1 \le i \le k\}$, the main steps of the cover set
method are

1. Generating a Cover Set from a Function Definition: A cover set associated
with a function $f$ is a finite set of triples. For a rule $l \to r$, where
$l = f(s_1, \ldots, s_m)$ and $f(t_1^i, \ldots, t_m^i)$ is the $i$-th recursive call to $f$ in the
right side $r$, the corresponding triple is
$\langle\langle s_1, \ldots, s_m\rangle, \{\ldots, \langle t_1^i, \ldots, t_m^i\rangle, \ldots\}, \{\}\rangle$.$^3$ The second component of a
triple is the empty set if there is no recursive call to $f$ in $r$.

The cover sets of double, exp2 and log obtained from their definitions in
Section 1 are given below.

Cover(double): {<<0>, {}, {}>, <<s(x)>, {<x>}, {}>},

Cover(exp2): {<<0>, {}, {}>, <<s(x)>, {<x>}, {}>},

Cover(log): {<<s(0)>, {}, {}>, <<x + x>, {<x>}, {}>, <<s(x + x)>, {<x>}, {}>}.

2. Generating Induction Schemes using Cover Sets: Given a conjecture C, a
basic term $t = f(x_1, \ldots, x_m)$ appearing in C can be chosen for generating
an induction scheme from the cover set of $f$. The variables in argument
positions in $t$ over which the definition of $f$ recurses are called induction
variables and the corresponding positions in $t$ are called the inductive (or
changeable) positions; other positions are called the unchangeable positions [2].

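An induction scheme is a finite set of induction cases, each of the form
$\langle \sigma_c, \{\theta_i\} \rangle$, generated from a cover set triple$^4$
$\langle\langle s_1, \ldots, s_m\rangle, \{\ldots, \langle t_1^i, \ldots, t_m^i\rangle, \ldots\}, \{\}\rangle$ as follows:
$\sigma_c = \{x_1 \leftarrow s_1, \ldots, x_m \leftarrow s_m\}$, and $\theta_i = \{x_1 \leftarrow t_1^i, \ldots, x_m \leftarrow t_m^i\}$.$^5$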

The induction scheme generated from the cover sets of double and exp2 is the
principle of mathematical induction. The scheme generated from the cover
set of log is different since the function log is defined over positive numbers
only. There is one basis step: $\langle \{x \leftarrow s(0)\}, \{\} \rangle$. There are two induction
steps: $\langle \{x \leftarrow m + m\}, \{\{x \leftarrow m\}\} \rangle$ and $\langle \{x \leftarrow s(m + m)\}, \{\{x \leftarrow m\}\} \rangle$.
The variable m is a positive number.

$^3$ The third component in a triple is a condition under which the conditional rewrite
rule is applicable; for simplicity, we are considering only unconditional rewrite rules,
so the third component is empty to mean that the rule is applicable whenever its left
side matches. The proposed approach extends to conditional rewrite rules as well.
See [16,15,13].

$^4$ The variables in a cover set triple are suitably renamed if necessary.

$^5$ To generate an induction scheme, it suffices to unify the subterm $t$ in a conjecture
C with the left side of each rule in the definition of $f$ as well as with the recursive
calls to $f$ in the right side of the rule. This is always possible in case $t$ is a basic
term; but it even works if only variables in the induction positions of $t$ are distinct.

3. Generating Induction Subgoals using an Induction Scheme: Each induction
case generates an induction subgoal: $\sigma_c$ is applied to the conjecture to
generate the induction conclusion, whereas each substitution $\theta_i$ applied to the
conjecture generates an induction hypothesis. Basis subgoals come from
induction cases whose second component is empty.

The reader can consult examples (C1) and (C2) discussed above, and see
how induction subgoals are generated using the induction schemes generated
from the cover sets of double and exp2.
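
The three steps can be traced on a small example. The sketch below is a hedged illustration, not the RRL implementation: terms are nested tuples with variables as strings, the helper names are invented for this sketch, and the right sides of the rules for double are assumed (double(0) --> 0 and double(s(x)) --> s(s(double(x)))) since Section 1 is not reproduced here. Only the left sides and recursive calls matter for the cover set, and those agree with Cover(double) above.

```python
def is_var(t):
    return isinstance(t, str)

def subst(theta, t):
    """Apply the substitution theta (dict: variable -> term) to the term t."""
    if is_var(t):
        return theta.get(t, t)
    return (t[0],) + tuple(subst(theta, a) for a in t[1:])

def calls(f, t):
    """Argument tuples of all calls to f occurring in the term t."""
    if is_var(t):
        return []
    found = [t[1:]] if t[0] == f else []
    for a in t[1:]:
        found += calls(f, a)
    return found

def cover_set(f, rules):
    """Step 1: one triple (lhs arguments, recursive-call arguments, condition) per rule."""
    return [(lhs[1:], calls(f, rhs), set()) for lhs, rhs in rules]

def induction_scheme(basic, cover):
    """Step 2: pairs (sigma_c, [theta_i]) for the basic term f(x1, ..., xm)."""
    xs = basic[1:]
    return [(dict(zip(xs, s)), [dict(zip(xs, t)) for t in rec])
            for s, rec, _cond in cover]

def subgoals(conjecture, scheme):
    """Step 3: pairs (hypotheses, conclusion); an equation is a pair (lhs, rhs)."""
    l, r = conjecture
    return [([(subst(th, l), subst(th, r)) for th in thetas],
             (subst(sigma, l), subst(sigma, r)))
            for sigma, thetas in scheme]

ZERO = ("0",)
# Assumed rules for double; the rule variable x is already distinct from the
# conjecture variable m, so no renaming is needed in this example.
DOUBLE = [
    (("double", ZERO), ZERO),
    (("double", ("s", "x")), ("s", ("s", ("double", "x")))),
]

C1 = (("double", "m"), ("+", "m", "m"))          # double(m) = m + m
cover = cover_set("double", DOUBLE)
scheme = induction_scheme(("double", "m"), cover)
goals = subgoals(C1, scheme)
```

Here `goals` contains one basis subgoal, double(0) = 0 + 0 with no hypothesis, and one step subgoal, double(s(x)) = s(x) + s(x) with the hypothesis double(x) = x + x.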

3 T-Based Definitions and Simple Conjectures 

Definition 1. A definition of a function symbol f is T-based in a decidable
theory T iff for each rule $f(t_1, \ldots, t_m) \to r$ in the definition, each $t_i$, $1 \le i \le m$,
is an interpreted term in T, any recursive calls to f in r only have interpreted
terms as arguments, and the abstraction of r defined as replacing recursive calls
to f in r by variables is an interpreted term in T.$^6$

For example, the definitions of double, log and exp2 given in Section 1 are
T-based over Presburger arithmetic. So is the definition of * given by the rules

1. x * 0 --> 0,

2. x * s(y) --> x + (x * y).

We abuse the notation slightly and call the functions themselves T-based
whenever their definitions are T-based.

In order to use T-based definitions for generating induction schemes, they
should be complete as well as terminating over T. For a brief discussion of how
to perform such checks, see [9,6]. It should be easy to see that terms in the cover
set generated from a T-based function definition are interpreted in T.

3.1 Simple Conjectures 

Definition 2. A term is T-based if it contains variables, interpreted function 
symbols from T and function symbols with T-based definitions. 

Definition 3. A conjecture $f(x_1, \ldots, x_m) = r$, where f has a T-based definition,
the $x_i$'s are distinct variables and r is interpreted in T, is called simple.

Note that both sides of a simple conjecture are T-based. 

For example, the conjecture (C1): double(m) = m + m about double is
simple over Presburger arithmetic, whereas the conjecture (C2): log(exp2(m))
= m about log is not simple over Presburger arithmetic.

For a simple conjecture, the cover set method proposes only one induction
scheme, which is generated from the cover set derived from the definition of f.

$^6$ If r includes occurrences of cond, a special built-in operator in RRL for doing
simulated conditional rewriting and automatic case analysis, then the first argument to
cond is assumed to be an interpreted boolean term in T.

Theorem 4. A simple conjecture C over a decidable theory T can be decided
using the cover set method.



Proof. Given $f(x_1, \ldots, x_m) = r$, where $r$ is interpreted in T, from the cover
set associated with the definition of $f$, an induction scheme can be generated
and a proof can be attempted.

Since $\sigma_c(f(x_1, \ldots, x_m)) = l_i$ for some rule $l_i \to r_i$ in the definition of $f$, the
left side of a basis subgoal

$\sigma_c(f(x_1, \ldots, x_m)) = \sigma_c(r)$,

rewrites using the rule to $r_i$, an interpreted term in T. The result is a decidable
formula in T. This part of the proof exploits the fact that the right side of a
simple conjecture, $r$, is an interpreted term in T.

For each induction step subgoal derived from a rule $l_j \to r_j$ in the definition
of $f$ where $r_j = h(\ldots, f(\ldots), \ldots)$, with recursive calls to $f$, the conclusion is
$\sigma_c(f(x_1, \ldots, x_m)) = \sigma_c(r)$, where $\sigma_c(f(x_1, \ldots, x_m)) = l_j$, with
$\theta_i(f(x_1, \ldots, x_m)) = \theta_i(r)$ being an induction hypothesis corresponding to each
recursive call to $f$ in $r_j$. The left side of the conclusion simplifies by the
corresponding rule to $r_j$, which includes an occurrence of $\theta_i(f(x_1, \ldots, x_m))$ as a
subterm at a position $p_i$ in $r_j$. The application of these hypotheses generates
the formula $r_j[p_1 \leftarrow \theta_1(r), \ldots, p_k \leftarrow \theta_k(r)] = \sigma_c(r)$ of T, since the abstraction of
$r_j$, after recursive calls to $f$ have been replaced by variables, is an interpreted
term in T.

Since every basis and induction step subgoal generated by the cover set 
method can be decided in T, the conjecture C can be decided by the cover 
set method. □ 
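
The argument of the proof can be replayed on (C1) with a short sketch. It is a minimal illustration under stated assumptions rather than the authors' procedure: the rules for double are assumed as double(0) --> 0 and double(s(x)) --> s(s(double(x))), the subgoals are written out by hand, and validity in T is checked only for equations between terms built from 0, s and +, by comparing linear normal forms (over the natural numbers two such terms are equal for all values of their variables exactly when the constants and all coefficients agree). All function names are hypothetical.

```python
from collections import Counter

def is_var(t):
    return isinstance(t, str)

def subst(theta, t):
    if is_var(t):
        return theta.get(t, t)
    return (t[0],) + tuple(subst(theta, a) for a in t[1:])

def match(pat, t, theta):
    """Extend theta so that theta(pat) == t; return None if impossible."""
    if is_var(pat):
        return theta if theta.setdefault(pat, t) == t else None
    if is_var(t) or pat[0] != t[0] or len(pat) != len(t):
        return None
    for p, a in zip(pat[1:], t[1:]):
        if match(p, a, theta) is None:
            return None
    return theta

def rewrite_once(rules, t):
    for lhs, rhs in rules:
        theta = match(lhs, t, {})
        if theta is not None:
            return subst(theta, rhs)
    if not is_var(t):
        for i in range(1, len(t)):
            new = rewrite_once(rules, t[i])
            if new is not None:
                return t[:i] + (new,) + t[i + 1:]
    return None

def normalize(rules, t):
    while (new := rewrite_once(rules, t)) is not None:
        t = new
    return t

def replace(t, old, new):
    """Replace every occurrence of the subterm old in t by new."""
    if t == old:
        return new
    if is_var(t):
        return t
    return (t[0],) + tuple(replace(a, old, new) for a in t[1:])

def linear(t):
    """Linear normal form (constant, variable coefficients) of a term over 0, s, +."""
    if is_var(t):
        return 0, Counter({t: 1})
    if t[0] == "0":
        return 0, Counter()
    if t[0] == "s":
        c, v = linear(t[1])
        return c + 1, v
    if t[0] == "+":
        (c1, v1), (c2, v2) = linear(t[1]), linear(t[2])
        return c1 + c2, v1 + v2
    raise ValueError("not an interpreted term: %r" % (t,))

def decide_simple(rules, subgoals):
    """Rewrite each conclusion, apply the hypotheses, and check the result in T."""
    for hyps, (l, r) in subgoals:
        l = normalize(rules, l)
        for hyp_l, hyp_r in hyps:
            l = replace(l, hyp_l, hyp_r)
        if linear(l) != linear(r):
            return False
    return True

ZERO = ("0",)
DOUBLE = [(("double", ZERO), ZERO),
          (("double", ("s", "x")), ("s", ("s", ("double", "x"))))]  # assumed rules

def simple_subgoals(rhs_of):
    """Basis and step subgoals of double(m) = rhs_of(m) from Cover(double)."""
    sx = ("s", "x")
    return [([], (("double", ZERO), rhs_of(ZERO))),
            ([(("double", "x"), rhs_of("x"))], (("double", sx), rhs_of(sx)))]

c1_holds = decide_simple(DOUBLE, simple_subgoals(lambda m: ("+", m, m)))
bad_holds = decide_simple(DOUBLE, simple_subgoals(lambda m: m))
```

In this sketch `c1_holds` is `True`, while the incorrect conjecture double(m) = m yields `False` because its step subgoal reduces to s(s(x)) = s(x), which is not a valid formula of Presburger arithmetic.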

As the above proof suggests, a slightly more general class of simple conjectures
can be decided. Not all the arguments to f need be distinct variables. It
suffices if the inductive positions in f are distinct variables, and the other
positions are interpreted and do not contain variables appearing in the inductive
positions. The above proof would still work.

For example, the following conjecture

(C3): append(n, nil) = n,

is not simple. The validity of the conjecture (C3) can be decided over the theory
of free constructors nil and cons for lists. The definition of append is

1. append(nil, x) --> x,

2. append(cons(x, y), z) --> cons(x, append(y, z)).

The requirement that unchangeable positions in a conjecture do not refer to 
the induction variables, seems essential for the above proof to work, as otherwise 
the application of the induction hypotheses may get blocked. 

For example, the cover set method fails to disprove a conjecture such as 



append(m, m) = m







from the definition of the function append. An inductive proof attempt based on
the cover set of the function append results in an induction step subgoal with
the conclusion

append(cons(x, y), cons(x, y)) = cons(x, y),

and the hypothesis append(y, y) = y. The conclusion rewrites to cons(x,
append(y, cons(x, y))) = cons(x, y), to which the hypothesis cannot be
applied. Therefore, the cover set method fails since the induction step subgoal
cannot be established.
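
The blocked step can be reproduced mechanically. The following sketch, with hypothetical helper names and the same tuple representation of terms as assumed elsewhere for illustration, normalizes the left side of the conclusion with the two append rules and checks that the left side of the hypothesis does not occur in the result.

```python
def is_var(t):
    return isinstance(t, str)

def subst(theta, t):
    if is_var(t):
        return theta.get(t, t)
    return (t[0],) + tuple(subst(theta, a) for a in t[1:])

def match(pat, t, theta):
    """Extend theta so that theta(pat) == t; return None if impossible."""
    if is_var(pat):
        return theta if theta.setdefault(pat, t) == t else None
    if is_var(t) or pat[0] != t[0] or len(pat) != len(t):
        return None
    for p, a in zip(pat[1:], t[1:]):
        if match(p, a, theta) is None:
            return None
    return theta

def rewrite_once(rules, t):
    for lhs, rhs in rules:
        theta = match(lhs, t, {})
        if theta is not None:
            return subst(theta, rhs)
    if not is_var(t):
        for i in range(1, len(t)):
            new = rewrite_once(rules, t[i])
            if new is not None:
                return t[:i] + (new,) + t[i + 1:]
    return None

def normalize(rules, t):
    while (new := rewrite_once(rules, t)) is not None:
        t = new
    return t

def occurs(sub, t):
    """True if sub is a subterm of t."""
    return t == sub or (not is_var(t) and any(occurs(sub, a) for a in t[1:]))

NIL = ("nil",)
APPEND = [
    (("append", NIL, "x"), "x"),
    (("append", ("cons", "x", "y"), "z"), ("cons", "x", ("append", "y", "z"))),
]

cxy = ("cons", "x", "y")
conclusion_lhs = normalize(APPEND, ("append", cxy, cxy))
hypothesis_lhs = ("append", "y", "y")
blocked = not occurs(hypothesis_lhs, conclusion_lhs)
```

The normal form is cons(x, append(y, cons(x, y))): the second argument of append still mentions the induction variable, so `blocked` is `True`.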

4 Complex T-Based Conjectures 

To decide more complex conjectures by inductive methods, the choice of
induction schemes has to be limited and the interaction among the function
definitions has to be analyzed. In [15], such an analysis is undertaken to
predict the failure of proof attempts a priori without actually attempting the proof.
The notion of compatibility of function definitions, an idea illustrated in
Section 1, is introduced for characterizing this interaction and for identifying
intermediate steps in a proof which get blocked in the absence of additional
lemmas.

In this section, we use related concepts to identify conditions under which
conjectures such as (C2), more complex than the generalized simple conjectures
discussed in Section 3, can be decided. We first consider the interaction between
two function symbols. This is subsequently extended to consider the interaction
among a sequence of function symbols.

Definition 5. A T-based term t is composed if

1. t is a basic term $f(x_1, \ldots, x_m)$, where f is T-based and the $x_i$'s are distinct
variables, or

2. (a) $t = f(s_1, \ldots, t', \ldots, s_m)$, where t' is composed and is in an inductive
position of a T-based function f, and each $s_i$ is an interpreted term,
and

(b) variables $x_i$'s appearing in the inductive positions of the basic subterm (in
the innermost position) of t do not appear elsewhere in t. Other variables
in unchangeable positions of the basic subterm can appear elsewhere in
t.

For example, the left side of the conjecture (C2), log(exp2(m)), is a
composed term of depth 2.

The first requirement in the above definition can be relaxed as in the case 
of simple conjectures. Only the variables in the inductive positions of a basic 
subterm in t have to be distinct; the terms interpreted in T can appear in the 
unchangeable positions of the basic subterm. 

Given a conjecture of the form l = r, where l is composed and r is interpreted,
it is easy to see that there is only one basic subterm in it whose outermost symbol
is T-based. The cover set method thus suggests only one induction scheme. We
will first consider conjectures such as (C2) in which the left side l is of depth 2;
later, conjectures in which l is of higher depth are considered.

For a conjecture $f(t_1, \ldots, g(x_1, \ldots, x_k), \ldots, t_m) = r$, the interaction between
the right sides of rules defining g and the left sides of rules defining f must
be considered, as seen in the proof of the conjecture (C2). The interaction is
formalized below in the property of compatibility.

Definition 6. A definition of f is compatible with a definition of g in its i-th
argument in T iff for each right side $r_g$ of a rule defining g, the following
conditions hold:

1. whenever $r_g$ is interpreted, $f(x_1, \ldots, r_g, \ldots, x_m)$ rewrites to an
interpreted term in T, and

2. whenever $r_g = h(s_1, \ldots, g(t_1, \ldots, t_k), \ldots, s_n)$, having a single recursive call
to g, the definition of f rewrites $f(x_1, \ldots, h(s_1, \ldots, y, \ldots, s_n), \ldots, x_m)$ to
$h'(u_1, \ldots, f(x_1, \ldots, y, \ldots, x_m), \ldots, u_n)$, where the $x_i$'s are distinct variables,
h, h' are interpreted symbols in T, and the $s_i$'s, $u_j$'s are interpreted terms of T.$^7$
In case $r_g$ has many recursive calls to g, say
$h(s_1, \ldots, g(t_1, \ldots, t_k), \ldots, g(v_1, \ldots, v_k), \ldots, s_n)$, then the definition of f
rewrites $f(x_1, \ldots, h(s_1, \ldots, y, \ldots, z, \ldots, s_n), \ldots, x_m)$ to
$h'(u_1, \ldots, f(x_1, \ldots, y, \ldots, x_m), \ldots, f(x_1, \ldots, z, \ldots, x_m), \ldots, u_n)$.

The definition of a function f is compatible with a definition of g iff it is
compatible with g in every argument position.

As will be shown later, the above requirements on compatibility cause the
function symbol f to be distributed over the interpreted terms so that it has g
as an argument and the induction hypothesis(es) can be applied.$^8$

The above definition is also applicable for capturing the interaction between 
an interpreted function symbol and a T-based function symbol. For example, 
the interpreted symbol + in Presburger arithmetic is compatible with * (in both 
arguments) because of the associativity and commutativity properties of +, which 
are valid formulas in T. 

As stated and illustrated in the introduction, the compatibility property can
be viewed as requiring that the composition of f with g has a T-based
definition. Space limitations do not allow us to elaborate on this interpretation
of the compatibility property.

For ensuring the compatibility property, any lemmas already proved about
f can be used along with the definition of f. The requirements for showing
compatibility can be used to speculate bridge lemmas as well.

The conjecture (C2) is of depth 2. The above insight can be generalized to
complex conjectures in which the left side is of arbitrary depth. A conjecture
in which a composed term of depth d is equated to an interpreted term can
be decided if all the function symbols from the root to the position p of the
basic subterm in its left side can be pushed in so that the induction hypothesis
is applicable. The notion of compatibility of a function definition with another
function definition is extended to a compatible sequence of definitions of function
symbols. In a compatible sequence of function symbols $\langle f_1, \ldots, f_d \rangle$, each $f_i$ is
compatible with $f_{i+1}$ at the $j_i$-th argument, $1 \le i \le d - 1$.

$^7$ The requirement on the definition of f can be relaxed by including bridge lemmas
along with the defining rules of f.

$^8$ In [5], we have given a more abstract treatment of these conditions. The above
requirement is one way of ensuring conditions in [5].

For example, consider the following conjecture

(C4): bton(pad0(ntob(m))) = m.

Functions bton and ntob convert binary representations to decimal
representations and vice versa, respectively. The function pad0 adds a leading binary
zero to a bit vector. These functions are used to reason about number-theoretic
properties of parameterized arithmetic circuits [12,10]. Padding of output bit
vectors of one stage with leading zeros before using them as input to the next
stage is common in multiplier circuits realized using a tree of carry-save adders.
An important property that is used while establishing the correctness of such
circuits is that the padding does not affect the number output by the circuit. The
underlying decidable theory is the combination of the quantifier-free theories of
bit vectors with free constructors nil, cons and b0, b1, to stand for binary 0
and 1, and Presburger arithmetic.

In the definitions below, bits increase in significance in the list with the first
element of the list being the least significant. Definitions of bton, ntob, and pad0
are T-based.

1. bton(nil) --> 0,

2. bton(cons(b0, y1)) --> bton(y1) + bton(y1),

3. bton(cons(b1, y1)) --> s(bton(y1) + bton(y1)),

4. ntob(0) --> cons(b0, nil),

5. ntob(s(0)) --> cons(b1, nil),

6. ntob(s(s(x2 + x2))) --> cons(b0, ntob(s(x2))),

7. ntob(s(s(s(x2 + x2)))) --> cons(b1, ntob(s(x2))),

8. pad0(nil) --> cons(b0, nil),

9. pad0(cons(b0, y)) --> cons(b0, pad0(y)),

10. pad0(cons(b1, y)) --> cons(b1, pad0(y)).
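
As a quick way to read rules 1 to 10, they can be transcribed into ordinary recursive functions. The sketch below is only a finite sanity check of (C4) on small numbers, not the inductive proof; it assumes bit vectors are modelled as Python lists of 0/1 with the least significant bit first, and natural numbers as Python integers.

```python
def bton(bits):
    """Rules 1-3: the number denoted by a bit list, least significant bit first."""
    if not bits:
        return 0
    rest = bton(bits[1:])
    return rest + rest + bits[0]

def ntob(n):
    """Rules 4-7: 0 and 1 are single bits; otherwise emit the low bit and halve."""
    if n < 2:
        return [n]
    return [n % 2] + ntob(n // 2)

def pad0(bits):
    """Rules 8-10: add a zero at the most significant end of the list."""
    if not bits:
        return [0]
    return [bits[0]] + pad0(bits[1:])

c4_holds_up_to_255 = all(bton(pad0(ntob(m))) == m for m in range(256))
```

For instance, ntob(6) is the list [0, 1, 1], pad0 turns it into [0, 1, 1, 0], and bton maps the padded list back to 6.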

The function pad0 is compatible with ntob; bton is compatible with pad0 as well
as ntob. However, ntob is not compatible with bton since ntob(s(bton(y1) +
bton(y1))) cannot be rewritten using the definition of ntob. However, bridge
lemmas,

ntob(bton(y1) + bton(y1)) = cons(b0, ntob(bton(y1)))
ntob(s(bton(y1) + bton(y1))) = cons(b1, ntob(bton(y1)))

can be identified such that along with these lemmas, ntob is compatible with
bton.
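
These compatibility claims can be checked by rewriting. The sketch below is a simplified reading of Definition 6 restricted to unary f, which suffices for bton, ntob and pad0; the representation of terms as tuples, the helper names, and the treatment of repeated recursive calls (each is abstracted by its own fresh variable) are assumptions of this sketch, and bridge lemmas are not included.

```python
def is_var(t):
    return isinstance(t, str)

def subst(theta, t):
    if is_var(t):
        return theta.get(t, t)
    return (t[0],) + tuple(subst(theta, a) for a in t[1:])

def match(pat, t, theta):
    """Extend theta so that theta(pat) == t; return None if impossible."""
    if is_var(pat):
        return theta if theta.setdefault(pat, t) == t else None
    if is_var(t) or pat[0] != t[0] or len(pat) != len(t):
        return None
    for p, a in zip(pat[1:], t[1:]):
        if match(p, a, theta) is None:
            return None
    return theta

def rewrite_once(rules, t):
    for lhs, rhs in rules:
        theta = match(lhs, t, {})
        if theta is not None:
            return subst(theta, rhs)
    if not is_var(t):
        for i in range(1, len(t)):
            new = rewrite_once(rules, t[i])
            if new is not None:
                return t[:i] + (new,) + t[i + 1:]
    return None

def normalize(rules, t):
    while (new := rewrite_once(rules, t)) is not None:
        t = new
    return t

ZERO, NIL, B0, B1 = ("0",), ("nil",), ("b0",), ("b1",)
INTERPRETED = {"0", "s", "+", "nil", "cons", "b0", "b1"}

def cons(b, y):
    return ("cons", b, y)

def plus(a, b):
    return ("+", a, b)

BTON = [(("bton", NIL), ZERO),
        (("bton", cons(B0, "y1")), plus(("bton", "y1"), ("bton", "y1"))),
        (("bton", cons(B1, "y1")), ("s", plus(("bton", "y1"), ("bton", "y1"))))]
NTOB = [(("ntob", ZERO), cons(B0, NIL)),
        (("ntob", ("s", ZERO)), cons(B1, NIL)),
        (("ntob", ("s", ("s", plus("x2", "x2")))), cons(B0, ("ntob", ("s", "x2")))),
        (("ntob", ("s", ("s", ("s", plus("x2", "x2"))))), cons(B1, ("ntob", ("s", "x2"))))]
PAD0 = [(("pad0", NIL), cons(B0, NIL)),
        (("pad0", cons(B0, "y")), cons(B0, ("pad0", "y"))),
        (("pad0", cons(B1, "y")), cons(B1, ("pad0", "y")))]

def abstract(g, t, fresh):
    """Replace each recursive call to g in t by a fresh variable."""
    if is_var(t):
        return t
    if t[0] == g:
        fresh.append("_r%d" % len(fresh))
        return fresh[-1]
    return (t[0],) + tuple(abstract(g, a, fresh) for a in t[1:])

def pushed(f, t, fresh):
    """True if t is interpreted except for calls f(v) with v a fresh variable."""
    if is_var(t):
        return t not in fresh          # a bare fresh variable: f was not pushed onto it
    if t[0] == f:
        return len(t) == 2 and t[1] in fresh
    return t[0] in INTERPRETED and all(pushed(f, a, fresh) for a in t[1:])

def compatible(f, f_rules, g, g_rules):
    """Definition 6 for a unary f: check every right side of a rule defining g."""
    for _lhs, rhs in g_rules:
        fresh = []
        t = normalize(f_rules, (f, abstract(g, rhs, fresh)))
        if not pushed(f, t, fresh):
            return False
    return True
```

With these definitions, `compatible("pad0", PAD0, "ntob", NTOB)`, `compatible("bton", BTON, "pad0", PAD0)` and `compatible("bton", BTON, "ntob", NTOB)` hold, while `compatible("ntob", NTOB, "bton", BTON)` fails because no rule of ntob applies to a sum of two abstracted recursive calls.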

A proof attempt of (C4) leads to two basis and two step subgoals based on
the cover set of ntob. The first basis subgoal, where m <- 0, is

bton(pad0(ntob(0))) = 0.

The subterm ntob(0) rewrites using the definition of ntob to cons(b0, nil),
then pad0(cons(b0, nil)) rewrites to cons(b0, cons(b0, nil)), and
finally, bton(pad0(ntob(0))) rewrites to 0 + 0 + 0 + 0, simplifying the above
equation to a valid formula in Presburger arithmetic. The second basis subgoal
is similar.

Consider the first induction step subgoal. The conclusion is

bton(pad0(ntob(s(s(x2 + x2))))) = s(s(x2 + x2))

with the hypothesis being

bton(pad0(ntob(s(x2)))) = s(x2).

The subterm ntob(s(s(x2 + x2))) in the left side of the conclusion rewrites
to cons(b0, ntob(s(x2))) by the definition of ntob; the subterm pad0(cons(b0,
ntob(s(x2)))) then rewrites to cons(b0, pad0(ntob(s(x2)))). The term
bton(pad0(ntob(s(s(x2 + x2))))) thus rewrites to bton(pad0(ntob(s(x2))))
+ bton(pad0(ntob(s(x2)))), on which the hypothesis is applicable. The result
is a valid formula s(x2) + s(x2) = s(s(x2 + x2)) in Presburger arithmetic.
It can be shown that the second step subgoal also simplifies to a valid formula
in Presburger arithmetic.

Every induction subgoal can be decided, and hence (C4) can be decided. 

The reader would have noticed that the compatibility requirement ensures 
that all the function symbols are pushed over interpreted symbols for the induc- 
tion hypothesis to be applicable. 

Note: For understanding the proof below, it would be helpful to concurrently 
consult the proofs of examples (C2) in Section 1 as well as of (C4) above. 

Theorem 7. The validity of a conjecture l = r, where l is a composed term and
r is interpreted in T, can be decided by the cover set method if the sequence of
function symbols $\langle f_d, f_{d-1}, \ldots, f_2, f_1 \rangle$ from the outermost function symbol $f_d$ of
l to the basic subterm $f_1(x_1, \ldots, x_m)$ is compatible.

Proof. By induction on the depth d of l.

Basis case (d = 2): Consider a conjecture l = r where
$l = f_2(t_1, \ldots, f_1(x_1, \ldots, x_m), \ldots, t_k)$ and $\langle f_2, f_1 \rangle$ form a compatible
sequence (i.e., $f_2$ is compatible with $f_1$ in its argument position), each $t_j$,
$1 \le j \le k$, and r are interpreted terms in T. Recall that any induction variable
$x_i$ appearing in an inductive position of $f_1$ does not occur in any $t_j$.

The cover set method uses the induction scheme generated from the cover
set associated with the definition of $f_1$.

Consider a basis subgoal $\sigma_c(l) = \sigma_c(r)$ generated from a rule $l_1 \to r_1$ in
the definition of $f_1$, where $r_1$ does not have any recursive calls to $f_1$. Since
$\sigma_c(f_1(x_1, \ldots, x_m)) = l_1$, $\sigma_c(l)$ rewrites to $f_2(\sigma_c(t_1), \ldots, r_1, \ldots, \sigma_c(t_k))$. Since $t_i$
does not include any induction variable, $\sigma_c(t_i) = t_i$, implying
$f_2(\sigma_c(t_1), \ldots, r_1, \ldots, \sigma_c(t_k)) = f_2(t_1, \ldots, r_1, \ldots, t_k)$. Because of compatibility of
$f_2$ with $f_1$, $f_2(t_1, \ldots, r_1, \ldots, t_k)$ rewrites to an interpreted term. The basis
subgoal therefore simplifies to a formula in T.

Consider an induction step subgoal generated from a rule $l_2 \to r_2$ where
$r_2 = h_1(s_1, \ldots, f_1(v_1, \ldots, v_m), \ldots, s_n)$ with a single recursive call to $f_1$ (for
simplicity). Let the conclusion be
$\sigma_c(f_2(t_1, \ldots, f_1(x_1, \ldots, x_m), \ldots, t_k)) = \sigma_c(r)$ and the induction hypothesis be
$\theta_i(f_2(t_1, \ldots, f_1(x_1, \ldots, x_m), \ldots, t_k)) = \theta_i(r)$, where
$\sigma_c(f_1(x_1, \ldots, x_m)) = l_2$ and $\theta_i(f_1(x_1, \ldots, x_m)) = f_1(v_1, \ldots, v_m)$. The left side of
the conclusion rewrites to $f_2(t_1, \ldots, r_2, \ldots, t_k)$ (just as in the basis case). As per
the definition of compatibility of $f_2$ with $f_1$,
$f_2(y_1, \ldots, h_1(s_1, \ldots, y, \ldots, s_n), \ldots, y_k)$ rewrites to
$h_2(s'_1, \ldots, f_2(y_1, \ldots, y, \ldots, y_k), \ldots, s'_n)$, where the $y_i$ and $y$ are distinct
variables, and $h_2$, the $s_j$'s and the $s'_j$'s are in T. This means that the left side of the
simplified conclusion $f_2(t_1, \ldots, r_2, \ldots, t_k)$ rewrites by the same sequence of rules
to $h_2(\delta(s'_1), \ldots, \delta(f_2(y_1, \ldots, y, \ldots, y_k)), \ldots, \delta(s'_n))$, where $\delta(y_i) = t_i$,
$1 \le i \le k$, and $\delta(y) = f_1(v_1, \ldots, v_m)$. The hypothesis applies since $\theta_i(t_j) = t_j$ for
all $t_j$ (recall that there are no $x_i$'s in the $t_j$'s), and
$\theta_i(f_1(x_1, \ldots, x_m)) = f_1(v_1, \ldots, v_m)$, which simplifies the conclusion to
$h_2(\delta(s'_1), \ldots, \theta_i(r), \ldots, \delta(s'_n)) = \sigma_c(r)$, a formula in T.

The above proof step assumed a single recursive call in $r_2$ and the application
of a single induction hypothesis. The proof generalizes when there are multiple
recursive calls in $r_2$ and many possibly different hypotheses have to be applied.

Induction Step case: Assume that the statement of the theorem holds for all
conjectures l' = r', where l' is a composed term of depth d' < d, and r' is interpreted.

The main idea in this proof is to use the fact that a conjecture l = r in
which the composed term $l = f_d(t_1, \ldots, l_{d-1}, \ldots, t_k)$ is of depth d, uses the same
induction scheme as a related conjecture $l_{d-1} = c$, where $l_{d-1}$ is a composed term
of depth d - 1, and c is an interpreted term. By the induction hypothesis, $l_{d-1} = c$
can be decided since all subgoals, including basis and induction steps, can be
decided. Because of the compatibility of $f_d$ with $f_{d-1}$, the outermost symbol of
$l_{d-1}$, it can be shown that each subgoal of l = r using the same induction scheme
can also be decided. In the basis step, the instantiated conjecture rewrites to a
formula in T, and in the induction step, $f_d$ can be pushed over the interpreted
symbols to surround $f_{d-1}$ so that the hypothesis is again applicable, resulting
in a formula in T. More details follow.

The same basic term $f_1(x_1, \ldots, x_k)$ in $l_{d-1}$ used for generating an induction
scheme for $l_{d-1} = c$ is also used for generating an induction scheme for l = r.
By the induction hypothesis, each of the subgoals generated from $l_{d-1} = c$
using this induction scheme can be decided in T. For l = r as well, a subgoal
$\sigma_c(f_d(t_1, \ldots, l_{d-1}, \ldots, t_k)) = \sigma_c(r)$ using the same substitution can be decided.

Consider a basis subgoal $\sigma_c(l_{d-1}) = \sigma_c(c)$ where $\sigma_c(l_{d-1})$ simplifies to the
interpreted term u through a sequence of rewrite steps using the definitions of
the T-based function symbols in $l_{d-1}$. (The T-based function symbols in $l_{d-1}$
are successively eliminated in a bottom up fashion starting with $f_1$ until finally
$f_{d-1}$ rewrites to u by a rule of the form $f_{d-1}(\ldots, r_g, \ldots) \to u$ in the definition of
$f_{d-1}$.) By the compatibility of $f_d$ with $f_{d-1}$, $f_d(t_1, \ldots, u, \ldots, t_k)$ rewrites to an
interpreted term, say u', implying that the basis subgoal $\sigma_c(l) = \sigma_c(r)$ simplifies
to $u' = \sigma_c(r)$, a formula in T.

Consider an induction step subgoal with the conclusion $\sigma_c(l_{d-1}) = \sigma_c(c)$
and the hypothesis $\theta_i(l_{d-1}) = \theta_i(c)$, generated from a rule in the definition
of $f_1$ whose right side has a single recursive call to $f_1$ (for simplicity). The
left side of the conclusion simplifies through a sequence of rewrite steps
using the definitions of the T-based function symbols in $l_{d-1}$ to a term of the
form $h_{d-1}(s'_1, \ldots, \theta_i(l_{d-1}), \ldots, s'_n)$ where the $s'_j$'s and $h_{d-1}$ are interpreted. The
T-based functions in $l_{d-1}$ are successively pushed over interpreted function
symbols in a bottom up fashion until finally $f_{d-1}$ is pushed using a rule of the form
$f_{d-1}(y_1, \ldots, h_{d-2}(s_1, \ldots, y, \ldots, s_n), \ldots, y_k) \to h_{d-1}(s'_1, \ldots, f_{d-1}(y_1, \ldots, y, \ldots, y_k), \ldots, s'_n)$,
to get the left side of the hypothesis. By compatibility of $f_d$ with $f_{d-1}$,
$f_d(z_1, \ldots, h_{d-1}(s'_1, \ldots, z, \ldots, s'_n), \ldots, z_k)$ rewrites to
$h_d(s''_1, \ldots, f_d(z_1, \ldots, z, \ldots, z_k), \ldots, s''_n)$, where $h_d$ and the $s''_j$'s are interpreted.
This implies that in the corresponding induction step subgoal, the left side of the
conclusion $\sigma_c(f_d(t_1, \ldots, l_{d-1}, \ldots, t_k))$ will simplify to
$h_d(s''_1, \ldots, \theta_i(f_d(t_1, \ldots, l_{d-1}, \ldots, t_k)), \ldots, s''_n)$, a term containing the left side of
the hypothesis. The application of the hypothesis simplifies the conclusion to
$h_d(s''_1, \ldots, \theta_i(r), \ldots, s''_n) = \sigma_c(r)$, a formula in T. □



5 Relaxing Linearity Requirement: Nonlinear
Conjectures

To cover a larger class of formulas, we discuss conditions for deciding a conjecture 
with multiple occurrences of induction variables in its left side. 

Definition 8. A conjecture $f(s_1, \ldots, s_m) = r$, where $f(s_1, \ldots, s_m)$ is a T-based
term, r is interpreted in T, and for $1 \le i \le m$, either $s_i$ is interpreted in T, or
$s_i = g_i(x_1, \ldots, x_n)$ is a basic term, is called basic nonlinear if some variable has
multiple occurrences in the left side.

In a basic nonlinear conjecture, induction variables (as well as noninduction
variables) appearing as arguments in basic terms can be shared. For example,
the conjecture below is basic nonlinear,

(C5): append(blast(m), last(m)) = m,

where last returns the singleton list containing the last element of a list, and
blast returns the input list without the last element.



1. last(cons(x, nil)) --> cons(x, nil),

2. last(cons(x, cons(y, z))) --> last(cons(y, z)),

3. blast(cons(x, nil)) --> nil,

4. blast(cons(x, cons(y, z))) --> cons(x, blast(cons(y, z))).



To decide such a conjecture, additional conditions become necessary. First,
since there can be many induction schemes possible, one each generated from
the cover set of a basic term, it is required that they can be merged into a
single induction scheme [2,9] (the case when each cover set generates the same
induction scheme trivially satisfies this requirement). The second requirement is
similar to that of compatibility: f above must be simultaneously compatible with
each $g_i$. In the definition below, we assume, for simplicity, that there is at most
one recursive call in the function definitions.

Definition 9. The definition of f is simultaneously compatible in T with the
definitions of g and h in its i-th and j-th arguments, where $i \ne j$, if for each right
side $r_g$ and $r_h$ of the rules in the definitions of g and h, respectively:

1. whenever $r_g$ and $r_h$ are interpreted in T, $f(x_1, \ldots, r_g, \ldots, r_h, \ldots, x_m)$
rewrites to an interpreted term in T, and

2. whenever $r_g = h_1(\ldots, g(\ldots), \ldots)$ and $r_h = h_2(\ldots, h(\ldots), \ldots)$, the definition
of f rewrites $f(x_1, \ldots, h_1(\ldots, x, \ldots), \ldots, h_2(\ldots, y, \ldots), \ldots, x_m)$ to
$h_3(\ldots, f(x_1, \ldots, x, \ldots, y, \ldots, x_m), \ldots)$.

For example, append is simultaneously compatible with blast in its first 
argument and last in its second argument. 

Theorem 10. A basic nonlinear conjecture
$f(\ldots, g(x_1, \ldots, x_n), \ldots, h(x_1, \ldots, x_n), \ldots) = r$ such that $x_1, \ldots, x_n$ do not
appear elsewhere in the left side of the conjecture and the remaining arguments
of f are interpreted terms, can be decided by the cover set method if f is
simultaneously compatible with g and h at the i-th and j-th arguments, respectively,
and the induction schemes suggested by $g(x_1, \ldots, x_n)$ and $h(x_1, \ldots, x_n)$ can be merged.

The proof is omitted due to lack of space; it is similar to the proof of the 
basis case of Theorem 7. The main steps are illustrated using a proof of (C5). 

The induction schemes suggested by the basic terms blast(m), last(m) in
(C5) are identical. There is one basis subgoal and one induction step subgoal.
The basis subgoal obtained by m <- cons(x, nil),

append(blast(cons(x, nil)), last(cons(x, nil))) = cons(x, nil),

simplifies to a valid formula by the definitions of blast and last, and then by
the definition of append.

In the step subgoal, the induction conclusion is

append(blast(cons(x, cons(y, z))), last(cons(x, cons(y, z)))) = cons(x, cons(y, z)),

with the hypothesis being

append(blast(cons(y, z)), last(cons(y, z))) = cons(y, z).

The left side of the conclusion rewrites by the definitions of last and blast to
append(cons(x, blast(cons(y, z))), last(cons(y, z))), which rewrites using the
definition of append to cons(x, append(blast(cons(y, z)), last(cons(y, z)))),
to which the hypothesis applies, leading to the valid formula
cons(x, cons(y, z)) = cons(x, cons(y, z)).
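
The two subgoals of (C5) can be replayed by rewriting. The sketch below is an illustration with hypothetical helper names: it normalizes the left side of each conclusion with the rules of last, blast and append, replaces the left side of the hypothesis by its right side, and then compares both sides syntactically, which is how an equation between constructor terms is checked in the theory of free constructors.

```python
def is_var(t):
    return isinstance(t, str)

def subst(theta, t):
    if is_var(t):
        return theta.get(t, t)
    return (t[0],) + tuple(subst(theta, a) for a in t[1:])

def match(pat, t, theta):
    """Extend theta so that theta(pat) == t; return None if impossible."""
    if is_var(pat):
        return theta if theta.setdefault(pat, t) == t else None
    if is_var(t) or pat[0] != t[0] or len(pat) != len(t):
        return None
    for p, a in zip(pat[1:], t[1:]):
        if match(p, a, theta) is None:
            return None
    return theta

def rewrite_once(rules, t):
    for lhs, rhs in rules:
        theta = match(lhs, t, {})
        if theta is not None:
            return subst(theta, rhs)
    if not is_var(t):
        for i in range(1, len(t)):
            new = rewrite_once(rules, t[i])
            if new is not None:
                return t[:i] + (new,) + t[i + 1:]
    return None

def normalize(rules, t):
    while (new := rewrite_once(rules, t)) is not None:
        t = new
    return t

def replace(t, old, new):
    """Replace every occurrence of the subterm old in t by new."""
    if t == old:
        return new
    if is_var(t):
        return t
    return (t[0],) + tuple(replace(a, old, new) for a in t[1:])

NIL = ("nil",)

def cons(a, b):
    return ("cons", a, b)

RULES = [
    (("last", cons("x", NIL)), cons("x", NIL)),
    (("last", cons("x", cons("y", "z"))), ("last", cons("y", "z"))),
    (("blast", cons("x", NIL)), NIL),
    (("blast", cons("x", cons("y", "z"))), cons("x", ("blast", cons("y", "z")))),
    (("append", NIL, "x"), "x"),
    (("append", cons("x", "y"), "z"), cons("x", ("append", "y", "z"))),
]

def c5_lhs(m):
    return ("append", ("blast", m), ("last", m))

def holds(hyps, conclusion):
    l, r = conclusion
    l = normalize(RULES, l)
    for hyp_l, hyp_r in hyps:
        l = replace(l, hyp_l, hyp_r)
    return l == r

one, two = cons("x", NIL), cons("x", cons("y", "z"))
basis_ok = holds([], (c5_lhs(one), one))
step_ok = holds([(c5_lhs(cons("y", "z")), cons("y", "z"))], (c5_lhs(two), two))
```

Both `basis_ok` and `step_ok` evaluate to `True`; in the step case the normal form of the left side is cons(x, append(blast(cons(y, z)), last(cons(y, z)))), as in the text.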







The notion of simultaneous compatibility and the above theorem generalize 
to complex nonlinear conjectures, similar to the complex conjecture (C4) dis- 
cussed in Section 4, in which a conjecture includes a sequence of simultaneously 
compatible function symbols. Because of space limitations, we cannot discuss 
this in detail here. The example below illustrates the idea to some extent. The 
underlying theory is that of free constructors with 0, s. The function symbol +
is assumed to have the usual recursive definition: 0 + y -> y, s(x) + y ->
s(x + y). The conjecture

(C6): mod2(x) + (half(x) + half(x)) = x

is a complex nonlinear conjecture with the following definitions of half and
mod2.



1. half(0) -> 0,

2. half(s(0)) -> 0,

3. half(s(s(x))) -> s(half(x)),

4. mod2(0) -> 0,

5. mod2(s(0)) -> s(0),

6. mod2(s(s(x))) -> mod2(x).

For + to be compatible with half in both its arguments, an intermediate lemma 
(either the commutativity of + or x + s(y) = s(x + y)) is needed as well.® 

It can be determined a priori that (C6) can be decided by the cover set
method, since the basic terms half(x), mod2(x) suggest the same induction
scheme, and the function symbol + is simultaneously compatible with mod2 and +
(as well as with half) in the presence of the above lemma about +.
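
As an illustration, the rules for +, half and mod2 can be transcribed into Python and (C6) checked on an initial segment of the naturals. This is a minimal sketch in which Peano numerals are modelled as non-negative integers (s(x) is x + 1); the function names `plus` and `c6_holds` are hypothetical.

```python
def plus(x, y):
    # 0 + y -> y ; s(x) + y -> s(x + y)
    return y if x == 0 else 1 + plus(x - 1, y)

def half(x):
    # half(0) -> 0 ; half(s(0)) -> 0 ; half(s(s(x))) -> s(half(x))
    if x == 0 or x == 1:
        return 0
    return 1 + half(x - 2)

def mod2(x):
    # mod2(0) -> 0 ; mod2(s(0)) -> s(0) ; mod2(s(s(x))) -> mod2(x)
    if x == 0:
        return 0
    if x == 1:
        return 1
    return mod2(x - 2)

def c6_holds(x):
    # (C6): mod2(x) + (half(x) + half(x)) = x
    return plus(mod2(x), plus(half(x), half(x))) == x
```

Both half and mod2 recurse on the pattern 0, s(0), s(s(x)), which is why the two basic terms suggest the same induction scheme.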



6 Bootstrapping 

As discussed above, simple and complex conjectures with T-based function sym- 
bols can be decided using the cover set method, giving an extended decision pro- 
cedure and an extended decidable theory. In this section, we outline preliminary 
ideas for bootstrapping this extended decidable theory with the definitions of 
T-based function symbols and the associated induction schemes, to define and 
decide a larger class of conjectures. 

Definition 11. A definition of a function symbol f is extended T-based for a
decidable theory T if for each rule $f(t_1, \cdots, t_m) \to r$ in the definition,
where the $t_i$'s are interpreted over T, the only recursive call to f in r, if
any, has only T-based terms as arguments, and the abstraction of r after
replacing the recursive call to f by a variable is either an interpreted term
over T, or a basic term $g(\cdots)$ where g has an (extended) T-based definition.

® If + is defined by recursing on the second argument, even then commutativity of + 
or s(x) + y = s(x + y) is needed. 




342 Deepak Kapur and Mahadavan Subramaniam 



For example, exp denoting exponentiation, defined below, is extended T- 
based over Presburger arithmetic. For rules defining *, please refer to the begin- 
ning of Section 3. 

1. exp(x, 0) -> s(0),

2. exp(x, s(y)) -> x * exp(x, y).

Unlike simple conjectures, an inductive proof attempt of an extended T-based
conjecture may involve multiple applications of the cover set method. Induction
may be required to decide the validity of the induction subgoals. To determine
this a priori, the number of recursive calls in any rule of an extended
T-based definition is restricted to be at most one. The abstracted right side r
could be an interpreted term in T, or a basic term with an extended T-based
function.

Theorem 12. A simple extended T-based conjecture $f(x_1, \cdots, x_n) = r$, where
f is an extended T-based function, and r is interpreted over T, can be decided
by the cover set method.

The key ideas are suggested in the disproof of an illustrative conjecture about 
exp: 



(C7) : exp(s(0), m) = s(m). 

In the proof attempt of (C7), with induction variable m, there is one basis and 
one step subgoal. The basis subgoal, 

exp(s(0), 0) = s(0) 

rewrites by definition of exp to the valid formula s(0) = s(0). In the step 
subgoal, the conclusion 

exp(s(0), s(y)) = s(s(y)), 

rewrites by definition of exp to s(0) * exp(s(0), y) = s(s(y)), to which

the hypothesis, exp(s(0), y) = s(y), applies to give s(0) * s(y) = s(s(y)),
which then rewrites by definition of * to s(0) * y = s(y), a simple conjecture
which can be decided to be false by the cover set method.
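
The failure of (C7) can also be observed by direct evaluation. The sketch below models Peano numerals as non-negative integers. The rules for exp are the ones given above; the rules for + and * are assumed to be the usual recursive ones (the text refers to Section 3 for *, which is not reproduced here). The function name `c7_counterexample` is hypothetical.

```python
def plus(x, y):
    # 0 + y -> y ; s(x) + y -> s(x + y)
    return y if x == 0 else 1 + plus(x - 1, y)

def times(x, y):
    # Assumed rules: x * 0 -> 0 ; x * s(y) -> x + (x * y)
    return 0 if y == 0 else plus(x, times(x, y - 1))

def exp(x, y):
    # exp(x, 0) -> s(0) ; exp(x, s(y)) -> x * exp(x, y)
    return 1 if y == 0 else times(x, exp(x, y - 1))

def c7_counterexample(bound):
    # Search for the smallest m < bound with exp(s(0), m) != s(m).
    for m in range(bound):
        if exp(1, m) != m + 1:
            return m
    return None
```

The basis case m = 0 holds, matching the proof attempt, while the step case fails; the search returns the first value of m that refutes the conjecture.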

Complex extended T-based conjectures can be similarly defined, and condi- 
tions for deciding their validity can be developed. This is currently being ex- 
plored. 



7 Conclusion 

This paper describes how inductive proof techniques implemented in existing 
theorem provers, such as RRL, can be used to decide a subclass of equational 
conjectures. Sufficient conditions for such automation are identified based on the 
structure of the conjectures and the definitions of the function symbols appearing 
in the conjectures as well as interaction among the function definitions. 
The basic idea is that if the conditions are met, the induction subgoals
automatically generated from a conjecture by the cover set method simplify to formulas
in a decidable theory. This is first shown for simple conjectures with a single 
function symbol recursively defined using interpreted terms in a decidable the- 
ory. Subsequently, this is extended to complex conjectures with nested function 
symbols by defining the notion of compatibility among their definitions. The 
compatibility property ensures that in induction subgoals, function symbols can 
be pushed inside the instantiated conjectures using definitions and bridge lem- 
mas, so as to enable the application of the induction hypotheses, leading to 
decidable subgoals. 

It is shown that certain nonlinear conjectures with multiple occurrences of 
induction variables can also be decided by extending the notion of compatibility 
to that of simultaneous compatibility of a function symbol to many function 
symbols. Some preliminary ideas on bootstrapping the proposed approach are 
discussed by considering conjectures with function symbols that are defined in 
terms of other recursively defined function symbols. 

Our preliminary experience regarding the effectiveness of the proposed con- 
ditions is encouraging. Several examples about properties of lists and numbers 
as well as properties used to establish the number-theoretic correctness of arith- 
metic circuits have been successfully tried. 

Some representative conjectures, both valid and nonvalid formulas, decided 
by the proposed approach are given below. With each conjecture, the annota- 
tions indicate whether it is simple or complex, as discussed above, its validity 
and the underlying decidable subtheories. Conjectures are annotated as being 
nonlinear if they contain multiple basic terms with the same induction variables. 
For example, the conjectures 12-16 below are nonlinear since they have multiple 
basic terms with the induction variable x. However, conjectures 7-9 are not
nonlinear since they do not contain multiple basic terms even though they
contain multiple occurrences of the variable x. In conjectures 18-20, the underlying
theory is Presburger arithmetic extended with the function symbol *. 

Conjectures 16 and 17 establish the correctness of restricted forms of
carry-save and ripple-carry adders, respectively. The arguments to the two
adders are restricted to be the same in these conjectures. This restriction can
be relaxed, and the number-theoretic correctness of parameterized ripple-carry
and carry-save adders [12,10] can be established using the proposed approach. In
addition, several intermediate lemmas involved in the proof of multiplier
circuits and the SRT divider circuit [10,11] can be handled.

1. half(double(x)) = x, [Complex, valid, Presburger]

2. mod2(double(x)) = 0, [Complex, valid, Presburger]

3. half(mod2(x)) = 0, [Complex, valid, Presburger]

4. log(mod2(x)) = 0, [Complex, valid, Presburger]

5. exp2(log(x)) = x, [Complex, invalid, Presburger]

6. log(exp2(x)) = x, [Complex, valid, Presburger]

7. x * log(mod2(x)) = 0, [Complex, valid, Presburger]

8. x * mod2(double(x)) = 0, [Complex, valid, Presburger]

9. memb(x, delete(x, y)) = false, [Complex, valid, Lists]

10. bton(pad0(ntob(x))) = x, [Complex, valid, Lists]

11. last(ntob(double(x))) = 0, [Complex, valid, Lists]

12. length(append(x, y)) - (length(x) + length(y)) = 0,
    [Complex, valid, nonlinear, Presburger, Lists]

13. rotate(length(x), x) = x,
    [Complex, valid, nonlinear, Presburger, Lists]

14. length(nth(x, y)) <= length(x),
    [Complex, valid, nonlinear, Presburger, Lists]

15. length(delete(x, y)) <= length(y),
    [Complex, valid, nonlinear, Presburger, Lists]

16. bton(carry-saveadder(ntob(x), ntob(x), ntob(x))) = x + x + x,
    [Complex, valid, nonlinear, Presburger, Bitvectors]

17. bton(ripple-carryadder(ntob(x), ntob(x), ntob(x))) = x + x + x,
    [Complex, valid, nonlinear, Presburger, Bitvectors]

18. exp(1, x) = x,
    [Simple, valid, Presburger extended by *]

19. exp(1, x) = s(x),
    [Simple, invalid, Presburger extended by *]

20. exp(x, mod2(double(y))) = s(0),
    [Complex, valid, Presburger extended by *]

Inductive reasoning plays a central role in several nontrivial applications, 
but induction techniques are hardly supported in many reasoning tools, primar- 
ily due to the intense manual intervention required to perform inductive proofs in 
general. The proposed approach can be used to integrate induction proof meth- 
ods in other reasoning tools and selectively invoke these methods to significantly 
enhance the reasoning capabilities of these tools without compromising automa- 
tion. For instance, procedures implementing the cover set induction method can 
be integrated as a component decision procedure in a cooperating decision pro- 
cedures framework; it can be invoked to check the validity of inductive subgoals. 

To make the proposed approach more effective, it should be generalized to 
decide more general quantifier-free formulas as well as mechanically generate 
subsidiary conditions under which a given quantifier-free formula is valid. Such 
an investigation has been initiated and preliminary results are discussed in [5]. It
is also necessary to consider decidability of formulas that require nested induc- 
tion. Another promising direction for extending this work is to use the proposed 
approach to guide generation of intermediate lemmas. 



Acknowledgements: Thanks to Jürgen Giesl and the referees for useful com-
ments on an earlier draft of the paper. 
Abstract. Although theoretically it is very powerful, the semantic path
ordering (SPO) is not so useful in practice, since its monotonicity has to 
be proved by hand for each concrete term rewrite system (TRS). 

In this paper we present a monotonic variation of SPO, called MSPO. It 
characterizes termination, i.e., a TRS is terminating if and only if its rules 
are included in some MSPO. Hence MSPO is a complete termination 
method. 

On the practical side, it can be easily automated using as ingredients
standard interpretations and general-purpose orderings like RPO. This
is shown to be a sufficiently powerful way to handle several non-trivial
examples and to obtain methods like dummy elimination or dependency
pairs (without the dependency graph refinement) as particular cases.
Finally, we obtain some positive modularity results for termination based 
on MSPO. 



1 Introduction 

Rewrite systems are sets of rules (directed equations) used to compute by re- 
peatedly replacing parts of a given formula with equal ones until the simplest 
possible form is obtained. Depending on the kind of objects that are rewritten 
there are different kinds of rewrite systems, like string rewrite systems (Thue or 
semi-Thue systems) or term rewrite systems (TRS; see [DJ90,Klo92,BN98] for 
detailed surveys). 

Termination is a fundamental property for most applications of rewrite sys- 
tems. Termination of a TRS is, in general, an undecidable property, even for one 
rule TRSs. Termination of TRSs can be proved by showing that the induced 
rewrite relation is included in a well-founded ordering on terms. If the ordering 
$\succ$ is also monotonic and stable under substitutions, i.e., a reduction ordering,
then it suffices to check that $l \succ r$ for every rule $l \to r$ in the system.

* Partially supported by the CICYT project HEMOSS ref. TIC98-0949-C02-01.

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 346-364, 2000.

© Springer-Verlag Berlin Heidelberg 2000
Monotonic orderings including the subterm relation are called simplification
orderings, and their well-foundedness follows from Kruskal's theorem [Kru60].
Inside this class, path orderings, and in particular the recursive path ordering
(RPO) [Der82], have received special attention (see [Der87,DJ90,BN98]).
Unfortunately, although these orderings are simple and easy to use, they turn out,
in many cases, to be a weak termination proving tool, as there are many TRSs 
that are terminating but are not contained in any simplification ordering, i.e. 
they are not simply terminating. 

To avoid this problem, many different transformation methods have been de- 
veloped, e.g. [BD86,BL90,Zan94,FZ95,Ste95,Xi98,KNT99]. By transforming the 
TRS into a set of ordering constraints, the dependency pair method [AG97,AG00] 
has become a successful general technique for proving termination of (non-simply 
terminating) TRSs. 

As an alternative to transformation methods, more powerful term orderings 
can be used. Due to its simplicity, the Semantic Path Ordering (SPO) [KL80]
is a natural and well-known candidate: in SPO the scheme of RPO is
generalized by replacing the precedence on function symbols by any (well-founded)
underlying (quasi-)ordering involving the whole term and not only its head sym- 
bol. Although the simplicity of the presentation is kept, this makes the ordering 
much more powerful. In fact, for every terminating TRS there is some SPO that 
includes its rewrite relation. 

Unfortunately, SPO is not so useful in practice. Due to the generalization, 
the monotonicity property is lost, even if the underlying ordering is monotonic. 
Hence, it is not sufficient to check that the rules are included in the ordering 
to ensure termination, since this does not imply that each rewrite step is inside 
the ordering. In order to ensure termination of a TRS R, the user is responsible 
for proving (by hand) monotonicity restricted to all terms s and t such that s 
rewrites to t with R in one step. 

In this paper we present a monotonic version of the SPO. On the one hand, 
this monotonic semantic path ordering (MSPO) is still very powerful for proving 
termination of TRSs: on the theoretical side, it also characterizes termination, 
i.e. a TRS is terminating if and only if the rules are included in some MSPO; 
and on the practical side, it generalizes most of the (automatable) termination 
proof methods. On the other hand, termination can be automatically checked 
once the ingredients, i.e. the underlying (base) quasi-orderings, of the MSPO are 
provided. 

The first and only other, as far as we know, monotonic version of SPO is 
due to Geser [Ges92]. On the one hand, this proposal is not as general as ours 
(in fact, as we will show, it does not characterize termination), and on the other 
hand, it is less suitable for practical implementations. In section 5.3 a detailed 
comparison with this work is provided. 

We are not only interested in using MSPO for checking termination, but
also for proving termination automatically. Since we cannot expect to
automatically generate an adequate MSPO for a given TRS whenever it exists, we
have studied particular classes of underlying quasi-orderings, which can be
automatically generated. As a hint of the power of the resulting family of
MSPOs, it is shown that some known methods like dummy elimination [FZ95] or
dependency pairs [AG97,AG00] (without the dependency graph refinement) are
particular instances.

Using these classes of underlying quasi-orderings the termination proofs of 
several non-simply terminating, but terminating, term rewriting systems can be 
fully automated. However, some heuristics for choosing the underlying quasi- 
ordering have to be developed in order to make the method more effective in 
practice. On the other hand, due to its generality and its additional flexibility, 
MSPO provides, in some cases, simpler termination proofs than the dependency 
pair method (see example 6). A first system based on MSPO has been developed,
with which many examples, including the ones in the paper, have been checked.
The software and examples are available at www.lsi.upc.es/~albert. 

Additionally, applying known abstract sufficient conditions ensuring
modularity of termination [Ohl94,Gra94], modularity results for termination based on
MSPO are obtained. In particular, for TRSs proved terminating using MSPO
with the aforementioned classes of underlying quasi-orderings, termination is
proved to be modular for disjoint systems and for finite constructor-sharing
systems. Note that these modularity properties are crucial for many practical
applications of automatic termination proof systems. 

Formal definitions and basic tools are introduced in section 2. In section 3 we
present and study the monotonic semantic path ordering. Section 4 is devoted 
to examples. Other termination methods are analyzed in section 5. Section 6 
presents an ordering constraint solving approach for proving termination using 
MSPO. In section 7 modularity results are presented. Some conclusions are given 
in section 8. 



2 Preliminaries 

In the following we consider that $\mathcal{F}$ is a set of function symbols,
$\mathcal{X}$ a set of variables and $\mathcal{T}(\mathcal{F}, \mathcal{X})$ is
the set of terms built from $\mathcal{F}$ and $\mathcal{X}$. Let $s$ and $t$ be
arbitrary terms in $\mathcal{T}(\mathcal{F}, \mathcal{X})$, let $f$ be a function
symbol in $\mathcal{F}$ and let $\sigma$ be a substitution. A (strict partial)
ordering $\succ$ is a transitive irreflexive relation. It is monotonic if
$s \succ t$ implies $f(\ldots, s, \ldots) \succ f(\ldots, t, \ldots)$, and stable
under substitution if $s \succ t$ implies $s\sigma \succ t\sigma$. Monotonic
orderings that are stable under substitutions are called rewrite orderings. A
reduction ordering is a rewrite ordering that is well-founded: there are no
infinite sequences $t_1 \succ t_2 \succ \cdots$.

The reflexive-transitive closure of a binary relation $\to$ is denoted by
$\to^*$ and the transitive closure by $\to^+$.

A term rewrite system (TRS) is a (possibly infinite) set of rules $l \to r$ where
$l$ and $r$ are terms. Given a TRS $R$, $s$ rewrites to $t$ with $R$, denoted by
$s \to_R t$, if there is some rule $l \to r$ in $R$, $s|_p = l\sigma$ for some
position $p$ and substitution $\sigma$, and $t = s[r\sigma]_p$. A TRS $R$ is
terminating if there exists no infinite sequence
$t_1 \to_R t_2 \to_R \cdots$. Thus, the transitive closure $\to_R^+$ of any
terminating TRS is a reduction ordering. Furthermore, reduction orderings
characterize termination of TRSs.

Theorem 1. A rewrite system $R$ is terminating if and only if all rules are
contained in a reduction ordering $\succ$, i.e., $l \succ r$ for every $l \to r \in R$.

Another interesting property of reduction orderings is that they can be
combined with the subterm relation $\rhd$ without losing well-foundedness. However,
note that in general, monotonicity will be lost, since we only add the subterm
relation, and not its monotonic closure.

Proposition 1. If $\succ$ is a reduction ordering then $\succ \cup \rhd$ is well-founded.

Given a relation $\succ$, the multiset extension of $\succ$ on finite multisets,
denoted by $\succ\!\!\succ$, is defined as the smallest transitive relation containing

$M \cup \{s\} \succ\!\!\succ M \cup \{t_1, \ldots, t_n\}$ if $s \succ t_i$ for all $i \in \{1 \ldots n\}$.

If $\succ$ is a well-founded ordering on terms then $\succ\!\!\succ$ is a
well-founded ordering on finite multisets of terms.
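
For a strict ordering, the multiset extension can be computed with the well-known difference-based characterization: $M$ is greater than $N$ if the two multisets differ and every element that $N$ has in excess is dominated by some element that $M$ has in excess. The following is a minimal Python sketch of that characterization, not code from the paper; the function name `multiset_gt` and the use of `collections.Counter` are choices made for illustration, and elements are assumed to be hashable.

```python
from collections import Counter

def multiset_gt(gt, ms, ns):
    """Multiset extension of the strict ordering gt, applied to the
    finite multisets given by the iterables ms and ns."""
    m, n = Counter(ms), Counter(ns)
    m_only = m - n  # elements M has in excess of N
    n_only = n - m  # elements N has in excess of M
    if not m_only:
        # Either M = N, or M is contained in N: not greater.
        return False
    # Every excess element of N must lie below some excess element of M.
    return all(any(gt(x, y) for x in m_only) for y in n_only)
```

For example, over the integers with the usual `>`, the multiset {5, 3, 1} is greater than {4, 4, 3, 1}, because the single 5 dominates both copies of 4.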

A quasi-ordering $\succeq$ is a transitive and reflexive binary relation. Its
inverse is denoted by $\preceq$. Its strict part $\succ$ is the strict ordering
$\succeq \setminus \preceq$ (i.e., $s \succ t$ iff $s \succeq t$ and
$s \not\preceq t$). Its equivalence $\sim$ is $\succeq \cap \preceq$. Note that
$\succeq$ is the disjoint union of $\succ$ and $\sim$, and that if $=$ denotes
syntactic equality then $\succ \cup =$ is a quasi-ordering whose strict part is
$\succ$.

Notation: In the remainder of this paper, $\succeq$ (possibly with subscripts)
will always denote a quasi-ordering.

The following definitions for $\succeq$ will be used:

1. $\succeq$ is monotonic if $\succ$ is.

2. $\succeq$ is well-founded if $\succ$ is.

3. $\succeq$ is stable under substitutions if $\succ$ is and $s\sigma \succeq t\sigma$ whenever $s \succeq t$.

4. $\succeq$ is quasi-monotonic if $f(\ldots, s, \ldots) \succeq f(\ldots, t, \ldots)$ whenever $s \succeq t$.

5. $\succeq$ is a quasi-reduction quasi-ordering if it fulfills the above properties 2, 3
and 4.

Note that if $\succeq$ is quasi-monotonic then $\succ$ is not necessarily monotonic.

The lexicographic combination of $\succeq_1, \ldots, \succeq_n$, denoted by
$(\succeq_1, \ldots, \succeq_n)_{lex}$, is defined as usual as:
$s \,(\succeq_1, \ldots, \succeq_n)_{lex}\, t$ iff either $s \succ_i t$ for some
$i$ and $s \succeq_j t$ for all $j < i$, or $s \succeq_i t$ for all $i$.

If all $\succeq_i$ are well-founded (stable under substitutions) then their
lexicographic combination also is.

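A precedence $\succeq_\mathcal{F}$ is a well-founded quasi-ordering on $\mathcal{F}$.

Given a precedence $\succeq_\mathcal{F}$, the recursive path ordering (RPO),
denoted by $\succ_{rpo}$, is defined recursively as follows:

$s = f(s_1, \ldots, s_m) \succ_{rpo} t$ iff

1. $s_i \succeq_{rpo} t$, for some $i = 1, \ldots, m$, or

2. $t = g(t_1, \ldots, t_n)$ with $f \succ_\mathcal{F} g$ and $s \succ_{rpo} t_i$ for all $i = 1 \ldots n$, or

3. $t = g(t_1, \ldots, t_n)$ with $f \sim_\mathcal{F} g$ and $\{s_1, \ldots, s_m\} \succ\!\!\succ_{rpo} \{t_1, \ldots, t_n\}$,

where $\succeq_{rpo}$ is defined as $\succ_{rpo} \cup =$.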
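
The three cases of the RPO definition translate almost literally into a recursive procedure. The sketch below is an illustration under stated assumptions, not the authors' implementation: variables are Python strings, an application $f(t_1, \ldots, t_n)$ is the tuple `("f", t1, ..., tn)`, and the precedence is a dictionary `prec` of integer ranks (equal ranks mean equivalent symbols). All names are hypothetical.

```python
from collections import Counter

def is_var(t):
    return isinstance(t, str)

def multiset_gt(gt, ms, ns):
    # Multiset extension of a strict ordering (difference-based form).
    m, n = Counter(ms), Counter(ns)
    m_only, n_only = m - n, n - m
    if not m_only:
        return False
    return all(any(gt(x, y) for x in m_only) for y in n_only)

def rpo_gt(prec, s, t):
    """Decide s >_rpo t for the precedence given by the ranks in prec."""
    if is_var(s):
        return False
    f, ss = s[0], s[1:]
    # Case 1: some argument s_i satisfies s_i >=_rpo t.
    if any(si == t or rpo_gt(prec, si, t) for si in ss):
        return True
    if is_var(t):
        return False
    g, ts = t[0], t[1:]
    # Case 2: f is strictly above g and s dominates every argument of t.
    if prec[f] > prec[g]:
        return all(rpo_gt(prec, s, ti) for ti in ts)
    # Case 3: f and g are equivalent; compare the argument multisets.
    if prec[f] == prec[g]:
        return multiset_gt(lambda a, b: rpo_gt(prec, a, b), ss, ts)
    return False
```

With `prec = {"plus": 2, "s": 1}`, the rule plus(s(x), y) -> s(plus(x, y)) is oriented from left to right: case 2 applies at the root and case 3 handles the recursive comparison of the two plus terms.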
3 A Monotonic Semantic Path Ordering

Let us now recall the definition of the semantic path ordering. Then we will 
analyze an example of non-monotonicity, which will provide the intuition behind 
the monotonic version we will propose. 

Definition 1. Given $\succeq_Q$, called the underlying (or base) quasi-ordering, the
semantic path ordering (SPO) [KL80], denoted as $\succ_{spo}$, is defined as

$s = f(s_1, \ldots, s_m) \succ_{spo} t$ iff

1. $s_i \succeq_{spo} t$ for some $i = 1, \ldots, m$, or

2. $s \succ_Q t = g(t_1, \ldots, t_n)$ and $s \succ_{spo} t_i$ for all $i = 1, \ldots, n$, or

3. $s \succeq_Q t = g(t_1, \ldots, t_n)$ and $\{s_1, \ldots, s_m\} \succ\!\!\succ_{spo} \{t_1, \ldots, t_n\}$,

where $\succeq_{spo}$ is defined as $\succ_{spo} \cup =$.

The semantic path ordering is well-defined, which can be easily proved by
induction on the sum of the sizes of $s$ and $t$, and fulfills the following property.

Lemma 1 ([KL80]). If $\succeq_Q$ is well-founded and stable under substitutions then
$\succ_{spo}$ is a well-founded ordering stable under substitutions.

But, as said before, the semantic path ordering is, in general, non-monotonic,
even when $\succeq_Q$ is quasi-monotonic (in fact, the same problem appears if
$\succeq_Q$ is monotonic). This is shown in the following example.

Example 1. Consider the following quasi-ordering $\succeq_Q$ defined for all terms $s$ and
$t$ as: (i) $f(s) \succeq_Q g(t)$; (ii) $f(s) \succeq_Q f(t)$ iff $s \succeq_Q t$; and
(iii) $g(s) \succeq_Q g(t)$, and reflexive in variables and constants.

This quasi-ordering is well-founded, since its strict part is
$u[f(s)] \succ_Q u[g(t)]$ for all non-empty contexts $u$ containing only the symbol
$f$ and for all terms $s$ and $t$ (note that the length of any decreasing sequence
is at most the number of $f$'s above the first $g$ symbol of the initial term in
the sequence); it is stable under substitutions and quasi-monotonic. However, the
induced SPO is not monotonic: by case 1 of SPO we have
$g(f(a)) \succ_{spo} f(a)$, but by adding the context $f([\,])$ onto both terms we
have $f(g(f(a))) \not\succ_{spo} f(f(a))$ (in fact, in this case we can even prove
$f(f(a)) \succ_{spo} f(g(f(a)))$).
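
The three comparisons claimed in Example 1 can be replayed mechanically. The sketch below is an illustration restricted to ground terms over the constant a and the unary symbols f and g; terms are tuples such as `("f", ("a",))`. It follows the three cases of the SPO definition, with case 3 taken to require only the non-strict base comparison, and all function names are hypothetical.

```python
from collections import Counter

A = ("a",)
def f(t): return ("f", t)
def g(t): return ("g", t)

def geq_q(s, t):
    # Base quasi-ordering of Example 1 on ground terms.
    hs, ht = s[0], t[0]
    if hs == "f" and ht == "g":
        return True                    # (i)
    if hs == "f" and ht == "f":
        return geq_q(s[1], t[1])       # (ii)
    if hs == "g" and ht == "g":
        return True                    # (iii)
    return s == t and len(s) == 1      # reflexive on constants

def gt_q(s, t):
    # Strict part of the base quasi-ordering.
    return geq_q(s, t) and not geq_q(t, s)

def multiset_gt(gt, ms, ns):
    m, n = Counter(ms), Counter(ns)
    m_only, n_only = m - n, n - m
    if not m_only:
        return False
    return all(any(gt(x, y) for x in m_only) for y in n_only)

def spo_gt(s, t):
    ss, ts = s[1:], t[1:]
    # Case 1: some argument of s is >=_spo t.
    if any(si == t or spo_gt(si, t) for si in ss):
        return True
    # Case 2: s is strictly above t in the base ordering.
    if gt_q(s, t) and all(spo_gt(s, ti) for ti in ts):
        return True
    # Case 3: s >=_Q t and the argument multisets decrease.
    return geq_q(s, t) and multiset_gt(spo_gt, ss, ts)
```

Evaluating `spo_gt(g(f(A)), f(A))` succeeds by case 1, while the same pair placed under the context f fails to be ordered, and the reversed pair `spo_gt(f(f(A)), f(g(f(A))))` succeeds, as stated in the example.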

Analyzing this example we can observe that, even if $\succeq_Q$ is quasi-monotonic
(or monotonic), since in case 1 of SPO we do not require $s \succeq_Q t$, it may happen
that by adding some context $u$, if we do not have $u[s] \succeq_Q u[t]$, we cannot apply
any of the cases. Then, in order to ensure monotonicity of $\succ_{spo}$, we need to be
sure that $u[s] \succeq_Q u[t]$ for any context. A way to obtain this is to require always
$s \succeq_Q t$, provided that $\succeq_Q$ is quasi-monotonic, that is, defining a new
ordering $\succ_M$ as $s \succ_M t$ if and only if $s \succeq_Q t$ and $s \succ_{spo} t$.

However, requiring always $s \succeq_Q t$ can be a bit too strong and, in fact, from
the example above we only need to have $u[s] \succeq_Q u[t]$ for every (non-empty)
context $u$. Hence instead of requiring $s \succeq_Q t$, we will ask for something weaker
that ensures $u[s] \succeq_Q u[t]$.
Definition 2. We say that $\succeq_I$ is quasi-monotonic on $\succeq_Q$ (or $\succeq_Q$ is
quasi-monotonic wrt. $\succeq_I$) if

$s \succeq_I t$ implies $f(\ldots s \ldots) \succeq_Q f(\ldots t \ldots)$

for all terms $s$ and $t$ and function symbols $f$.

Definition 3. A pair $(\succeq_I, \succeq_Q)$ is called a quasi-reduction pair if
$\succeq_I$ is quasi-monotonic, $\succeq_Q$ is well-founded, both are stable under
substitutions and $\succeq_I$ is quasi-monotonic on $\succeq_Q$.

Now we define the monotonic semantic path ordering (MSPO):

Definition 4. Let $(\succeq_I, \succeq_Q)$ be a quasi-reduction pair. The corresponding
monotonic semantic path ordering, denoted by $\succ_{mspo}$, is defined as:

$s \succ_{mspo} t$ if and only if $s \succeq_I t$ and $s \succ_{spo} t$

for all terms $s$ and $t$.

Theorem 2. $\succ_{mspo}$ is a reduction ordering.

Proof. Well-foundedness follows from the fact that $\succ_{mspo} \subseteq \succ_{spo}$
and $\succ_{spo}$ is well-founded. Transitivity and stability under substitutions
follow respectively from the transitivity and the stability of $\succeq_I$ and
$\succ_{spo}$.

For monotonicity we have to show that $s \succ_{mspo} t$ implies
$f(\ldots, s, \ldots) \succ_{mspo} f(\ldots, t, \ldots)$, that is,
$f(\ldots, s, \ldots) \succeq_I f(\ldots, t, \ldots)$ and also
$f(\ldots, s, \ldots) \succ_{spo} f(\ldots, t, \ldots)$, for all terms $s$ and $t$
and function symbols $f$.

By definition of $\succ_{mspo}$, $s \succ_{mspo} t$ implies $s \succeq_I t$. Hence, by
the quasi-monotonicity of $\succeq_I$, we have
$f(\ldots, s, \ldots) \succeq_I f(\ldots, t, \ldots)$. On the other hand, by
quasi-monotonicity of $\succeq_I$ on $\succeq_Q$, we have
$f(\ldots, s, \ldots) \succeq_Q f(\ldots, t, \ldots)$. Then, by definition of
$\succ_{mspo}$, $s \succ_{mspo} t$ implies $s \succ_{spo} t$ and, therefore, we have
$\{\ldots s \ldots\} \succ\!\!\succ_{spo} \{\ldots t \ldots\}$, which implies
$f(\ldots, s, \ldots) \succ_{spo} f(\ldots, t, \ldots)$ by case (3). □

The previous theorem shows that $\succ_{mspo}$ provides a correct method for
proving termination of term rewriting systems. The following result shows that it
is also a complete method, i.e., for any terminating term rewriting system there
is a monotonic semantic path ordering that includes its rules. Therefore
$\succ_{mspo}$ characterizes termination. The proof is very similar to the
completeness proof of SPO and also to the completeness proof of other methods
like semantic labeling in [MHH96].

Theorem 3. A rewrite system $R$ is terminating if and only if there exists some
quasi-reduction pair $(\succeq_I, \succeq_Q)$ such that $l \succ_{mspo} r$ for every rule $l \to r$ in $R$.

Proof. The right to left implication follows from Theorem 2. For the left to right,
we will build an appropriate pair $(\succeq_I, \succeq_Q)$ for any terminating TRS $R$.
Let $\succeq_I$ be $\to_R^*$ and let $\succeq_Q$ be $(\to_R \cup \rhd)^*$. By definition
of rewriting, $\succeq_I$ is quasi-monotonic and both $\succeq_I$ and $\succeq_Q$ are
stable under substitutions. By termination of $R$ and Proposition 1, the strict part
of $\succeq_Q$, i.e. $(\to_R \cup \rhd)^+$, is well-founded. Finally, since
$\succeq_I \subseteq \succeq_Q$ and $\succeq_I$ is quasi-monotonic, we have that
$\succeq_I$ is quasi-monotonic on $\succeq_Q$.

Therefore, since $(\succeq_I, \succeq_Q)$ is a quasi-reduction pair, we only have to
prove that $l \succ_{mspo} r$ for every rule $l \to r$ in $R$. By definition, we have
$l \succeq_I r$ and $l \succ_Q r$, and, moreover, for every subterm $r'$ of $r$ we have
$l \succ_Q r'$, since $l \to_R r \rhd r'$. Therefore, we have $l \succeq_I r$ and
$l \succ_{spo} r$, by repeatedly applying case 2, which implies $l \succ_{mspo} r$. □

The quasi-reduction pair condition is quite tight: as shown in the following
example, if $\succeq_Q$ is required to be a quasi-reduction quasi-ordering then even SPO
does not characterize termination, i.e. there are terminating sets of rules that
are not included in any SPO with an underlying quasi-reduction quasi-ordering.

Example 2. There is no quasi-reduction quasi-ordering $\succeq_Q$ such that the
following TRS is included in the generated SPO.

$a \to b$
$f(b) \to g(a)$
$h(a) \to h(f(b))$

First, note that the TRS is terminating (it will be shown using MSPO in
example 3). Now we show that these rules cannot be included in SPO with a
quasi-reduction quasi-ordering $\succeq_Q$. To this end it is important to remark that
in SPO, to have $c \succ_{spo} t$ for some constant $c$ and term $t$, we need
$c \succ_Q t'$ for all subterms $t'$ of $t$.

By definition we need $a \succ_Q b$ to be able to include the first rule. Then
to include the second one we need $f(b) \succ_Q g(a)$ and $f(b) \succeq_Q a$, in order to
conclude by case 2 first and then by case 2 or 3 for the recursive call with
$f(b) \succ_{spo} a$, since any other possibility requires $b \succ_Q a$, which
contradicts the first assumption. Now we have (i) $a \succ_Q b$, (ii)
$f(b) \succ_Q g(a)$ and (iii) $f(b) \succeq_Q a$.
We proceed with the third rule. If $a \succeq_{spo} h(f(b))$ we need $a \succ_Q f(b)$,
which contradicts assumption (iii). Otherwise we need $h(a) \succeq_Q h(f(b))$. By
assumption (iii) and quasi-monotonicity of $\succeq_Q$, we have
$h(f(b)) \succeq_Q h(a)$, which implies $h(a) \sim_Q h(f(b))$. Therefore we have to
apply case 3 of SPO, and hence we need $a \succ_{spo} f(b)$, which requires
$a \succ_Q f(b)$, contradicting assumption (iii).

Theorem 3 shows the theoretical power of the ordering as a termination proof 
method, but in order to make it more useful in practice in the following two 
sections we will show general methods to obtain quasi-reduction pairs $(\succeq_I, \succeq_Q)$.

3.1 Building $\succeq_I$

We consider $\succeq_I$ obtained by combining an interpretation $I$ on terms with some
quasi-reduction quasi-ordering $\succeq_B$, called the basic quasi-ordering, which can
be obtained by well-known practical general-purpose methods like the path
orderings or polynomial interpretations. For the interpretation $I$, as a general
property, we require the preservation of the quasi-monotonicity and stability
under substitutions of the basic quasi-ordering. Below, some particular
interpretations of this kind, suitable for practical applications, are provided.
These interpretations are not original; they have been used in many different
transformation-based termination methods, and in particular a slightly restricted
version of them, called argument filtering systems (AFS), has been used in a
similar way for the dependency pair method [AG00].

We will consider interpretations as mappings from terms to terms
$I : \mathcal{T}(\mathcal{F}, \mathcal{X}) \to \mathcal{T}(\mathcal{F}', \mathcal{X}')$,
although, of course, we can also consider interpretations from terms
to multisets of terms or any other domain provided that the required properties
are fulfilled.

From now on we consider that $\succeq_I$ is defined as follows:

$s \succeq_I t$ if and only if $I(s) \succeq_B I(t)$.

Note that this does not imply any loss of generality since $I$ can be the identity
mapping. Additionally, it follows that $s \succ_I t$ if and only if $I(s) \succ_B I(t)$.

For any $I$, if $\succeq_B$ is a quasi-ordering then $\succeq_I$ is a quasi-ordering which
is well-founded if $\succeq_B$ also is. For quasi-monotonicity and stability under
substitutions this is not always the case. Let us give some examples of
interpretations that do preserve these properties.

Each symbol $f$ can be interpreted either by a projection on a single argument,
denoted by the pair $(f(x_1, \ldots, x_n), x_i)$, or else by a function symbol $f_I$
applied to an arbitrary sequence obtained from the arguments of $f$, denoted by the
pair $(f(x_1, \ldots, x_n), f_I(x_{i_1}, \ldots, x_{i_k}))$, for some $k \geq 0$ and
$i_1, \ldots, i_k \in \{1, \ldots, n\}$.
Additionally we consider $I$ to be the identity for variables (although it can be
any bijection).

Note that in the AFS the second kind of interpretations are restricted to have
$f_I \notin \mathcal{F}$ and all $x_{i_j}$ to be different variables in
$\{x_1, \ldots, x_n\}$, which is not the case here.

We assume that there is only one pair for each symbol. Usually the identity
pairs will be omitted. Thus the interpretation $I$ is recursively defined from these
pairs as $I(x) = x$ and $I(f(t_1, \ldots, t_n))$ is

- $I(t_i)$ if we have the pair $(f(x_1, \ldots, x_n), x_i)$, or

- $f_I(I(t_{i_1}), \ldots, I(t_{i_k}))$ if we have the pair $(f(x_1, \ldots, x_n), f_I(x_{i_1}, \ldots, x_{i_k}))$.
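
The recursive definition of $I$ can be sketched in a few lines of Python. This is an illustration only: variables are strings, applications are tuples `("f", t1, ..., tn)`, and the pairs are encoded in a dictionary whose format (`("proj", i)` for a projection, `("sym", name, positions)` for the second kind of pair, with 0-based argument positions) is a hypothetical choice. Symbols without an entry are treated as having the omitted identity pair.

```python
def interpret(pairs, t):
    """Apply the interpretation I described by pairs to the term t."""
    if isinstance(t, str):
        return t                      # I is the identity on variables
    f, args = t[0], t[1:]
    rule = pairs.get(f)
    if rule is None:
        # Identity pair (omitted): keep f and interpret all arguments.
        return (f,) + tuple(interpret(pairs, a) for a in args)
    if rule[0] == "proj":
        # Pair (f(x1..xn), xi): project on one argument.
        return interpret(pairs, args[rule[1]])
    # Pair (f(x1..xn), f_I(x_i1..x_ik)): new head symbol applied to an
    # arbitrary sequence of arguments (repetitions are allowed).
    _, f_i, positions = rule
    return (f_i,) + tuple(interpret(pairs, args[i]) for i in positions)
```

For instance, with `pairs = {"f": ("proj", 0), "h": ("sym", "h1", (1, 1))}`, the term f(g(x), y) is interpreted as g(x), and h(f(x, y), g(z)) as h1(g(z), g(z)), which duplicates an argument, something the text notes is not permitted in the AFS setting.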

It is easy to show that these interpretations preserve quasi-monotonicity and 
stability under substitutions (recall that the stability of quasi-orderings requires 
also the stability of their strict parts) . 

Proposition 2. Let $I$ be an interpretation as defined above.

1. If $\succeq_B$ is quasi-monotonic then $\succeq_I$ is quasi-monotonic.

2. If $\succeq_B$ is stable under substitutions then $\succeq_I$ is stable under substitutions.
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3.2 Building ^q) 

In this section we show how quasi-reduction pairs can be obtained. 

First, as basic cases, we present two possible quasi-orderings fulfilling the
quasi-monotonicity requirement for a given $\succeq_I$.

Proposition 3. Let $\succeq_I$ be a quasi-reduction quasi-ordering. Then

1. $(\succeq_I, \succeq_I)$ is a quasi-reduction pair.

2. Let $\succeq_{\mathcal{F}}$ be a precedence on $\mathcal{F}$ and let $\succeq_T$ be defined as $s \succeq_T t$ iff
   $top(s) \succeq_{\mathcal{F}} top(t)$. Then $(\succeq_I, \succeq_T)$ is a quasi-reduction pair.

Now we show how to obtain new quasi-reduction pairs from one or several 
given quasi-reduction pairs. Hence, we can start by pairs as in the proposition 
above and then (repeatedly) apply the following properties to obtain more suit- 
able quasi-reduction pairs. 

First we define what a renaming quasi-ordering is.

Definition 5. Let $N$ be a mapping from $\mathcal{F}$ to $\mathcal{F}$, called a renaming in $\mathcal{F}$, and
let $f_N$ denote $N(f)$. We extend $N$ to terms, obtaining a head renaming map, in
the following way: $N(f(t_1, \ldots, t_m)) = f_N(t_1, \ldots, t_m)$ for every symbol $f$ in $\mathcal{F}$.

Given $\succeq$ and a renaming map $N$ in $\mathcal{F}$, the renaming quasi-ordering $\succeq^N$ is
defined as $s \succeq^N t$ if and only if either $s = t \in \mathcal{X}$ or $N(s) \succeq N(t)$.

Note that the renaming map is only applied to the head symbol of the term, 
and not to the arguments, and that, to preserve stability under substitutions, it 
is not defined for variables. This notion of renaming already appears in the de- 
pendency pair method, where the head symbol of both terms in each dependency 
pair is always renamed (see section 5.2 for details). 
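
As an illustration of Definition 5, the following sketch applies a renaming to the head symbol only and lifts a given quasi-ordering to its renaming quasi-ordering. The term encoding (variables as strings, applications as tuples) and the function names are assumptions, not part of the method itself.

```python
# A variable is a str; an application f(t1, ..., tn) is a tuple ("f", t1, ..., tn).

def rename_head(term, renaming):
    """Apply the renaming N to the head symbol only; arguments are left untouched."""
    if isinstance(term, str):
        raise ValueError("the head renaming map is not defined for variables")
    return (renaming.get(term[0], term[0]),) + term[1:]


def renaming_quasi_ordering(geq, renaming):
    """Build s >=^N t, i.e. either s = t is a variable or N(s) >= N(t)."""
    def geq_n(s, t):
        if isinstance(s, str) or isinstance(t, str):
            return s == t                    # only the case s = t in X applies
        return geq(rename_head(s, renaming), rename_head(t, renaming))
    return geq_n


# Toy base quasi-ordering (syntactic equality), used only to exercise the wrapper.
renaming = {"f": "F"}
geq_n = renaming_quasi_ordering(lambda s, t: s == t, renaming)
```

Treating every comparison that involves a variable as plain equality is what keeps the sketch faithful to the remark that $N$ is not defined for variables.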

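Proposition 4. $(\succeq_I, \succeq_Q)$ is a quasi-reduction pair if

1. $(\succeq_I, \succeq_{Q_0})$ is a quasi-reduction pair and $\succeq_Q$ is well-founded and stable under
   substitutions and $\succeq_{Q_0} \subseteq \succeq_Q$; or

2. $(\succeq_I, \succeq_{Q_0})$ is a quasi-reduction pair and $\succeq_Q$ is $\succeq_{Q_0}^N$ for some renaming map
   $N$ in $\mathcal{F}$; or

3. $(\succeq_I, \succeq_{Q_i})$ is a quasi-reduction pair for all $i \in \{1, \ldots, n\}$ and $\succeq_Q$ is
   $(\succeq_{Q_1}, \ldots, \succeq_{Q_n})_{lex}$.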

Let us show how propositions 3 and 4 can be used to build quasi-reduction
pairs. Assume that $\succeq_I$ is a quasi-reduction quasi-ordering, $\succeq_{\mathcal{F}}$ is a precedence
on $\mathcal{F}$ and $N$ is a renaming map in $\mathcal{F}$. Then, by proposition 3.1, $(\succeq_I, \succeq_I)$ is a
quasi-reduction pair and, by proposition 4.2, $(\succeq_I, \succeq_I^N)$ is a quasi-reduction pair
as well. Now since, by proposition 3.2, $(\succeq_I, \succeq_T)$ is a quasi-reduction pair, we can
conclude by proposition 4.3 that $(\succeq_I, (\succeq_I^N, \succeq_T)_{lex})$ is a quasi-reduction pair.
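
The lexicographic combination used above can be sketched as follows. The sketch assumes the usual reading of a lexicographic combination of quasi-orderings (an earlier component decides when it is strict, the next one is consulted only when the earlier one is an equivalence); the component orderings shown, a precedence given by numeric ranks and a comparison by term size, are toy assumptions chosen only to make the example executable.

```python
# A quasi-ordering is modelled as a function geq(s, t) -> bool.

def strict(geq):
    """Strict part: s >= t but not t >= s."""
    return lambda s, t: geq(s, t) and not geq(t, s)


def equivalent(geq):
    """Equivalence part: s >= t and t >= s."""
    return lambda s, t: geq(s, t) and geq(t, s)


def lex(*geqs):
    """Lexicographic combination (geq_1, ..., geq_n)_lex of quasi-orderings."""
    def geq_lex(s, t):
        for geq in geqs[:-1]:
            if strict(geq)(s, t):
                return True                  # decided by an earlier component
            if not equivalent(geq)(s, t):
                return False
        return geqs[-1](s, t)                # all earlier components are equivalences
    return geq_lex


def top_geq(rank):
    """s >=_T t iff top(s) >= top(t) in a precedence given by numeric ranks."""
    return lambda s, t: rank[s[0]] >= rank[t[0]]


def size(term):
    return 1 if isinstance(term, str) else 1 + sum(size(a) for a in term[1:])


combined = lex(top_geq({"f": 1, "g": 0}), lambda s, t: size(s) >= size(t))
```

Here `combined` compares head symbols first and falls back to the size comparison only when the heads are equivalent in the precedence.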
4 Examples

In the examples we will always provide the quasi-reduction pair $(\succeq_I, \succeq_Q)$ and in
some cases the details of the checking of $\succ_{mspo}$ will be included. First we give
the definition of $\succeq_Q$ using $\succeq_I$ and then the definition of $\succeq_I$. In all cases we use
the methods described in sections 3.1 and 3.2. Since for the basic quasi-ordering
$\succeq_B$ of $\succeq_I$ we always use RPO, to avoid confusion, its precedence will be denoted
by $\succeq_P$ (note that we can use other precedences to build $\succeq_Q$), and for simplicity
we will directly give its strict part $\succ_P$.

Example 3. The following TRS comes from example 2

    a       →  b
    f(b)    →  g(a)
    h(a)    →  h(f(b))

In this case for $\succeq_Q$ we use $\succeq_I^N$ with the renaming map $N$ which is the identity
except $N(f) = F$. For $\succeq_I$ we use the interpretation $I$ generated by the pairs
$(f(x), b)$ and $(g(x), b)$; and RPO with the precedence $h \succ_P F$, $F \succ_P g$, $F \succ_P a$,
$a \succ_P b$. Note that we have added to the signature the symbol $F$.

For the first rule we have $a \succeq_I b$ since $I(a) = a \succ_{rpo} b = I(b)$ and since
$N(a) = a$ and $N(b) = b$, we have $a \succ_Q b$, which implies $a \succ_{spo} b$ by case 2 of
SPO, and hence $a \succ_{mspo} b$.

For the second rule $I(f(b)) = b = I(g(a))$, and hence $f(b) \succeq_I g(a)$. To prove
$f(b) \succ_{spo} g(a)$, since $I(N(f(b))) = F(b) \succ_{rpo} b = I(N(g(a)))$, which implies
$f(b) \succ_Q g(a)$, by case 2 of SPO we only need to check $f(b) \succ_{spo} a$, which follows
again by case 2.

For the third rule $I(h(a)) = h(a) \succ_{rpo} h(b) = I(h(f(b)))$ and hence $h(a) \succeq_I
h(f(b))$. To prove $h(a) \succ_{spo} h(f(b))$, since $I(N(h(a))) = h(a) \succ_{rpo} h(b) =
I(N(h(f(b))))$, which implies $h(a) \succ_Q h(f(b))$, by case 2 of SPO we only need to
check $h(a) \succ_{spo} f(b)$. This follows again by case 2, since $I(N(h(a))) = h(a) \succ_{rpo}
F(b) = I(N(f(b)))$, and hence $h(a) \succ_Q f(b)$, and $h(a) \succ_{spo} b$ follows as well by
case 2.
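
The RPO comparisons used in Example 3 can be replayed mechanically. The sketch below is an illustration under stated assumptions: terms are encoded as tuples, RPO is implemented with multiset status and a strict precedence closed under transitivity, and the helper names (`interpret`, `rename_head`, `rpo_gt`, `gt_q`) are hypothetical.

```python
from collections import Counter

# A variable is a str; an application f(t1, ..., tn) is a tuple ("f", t1, ..., tn).

def interpret(term, pairs):
    """I(term); here `pairs` only maps a symbol to a replacement constant."""
    if isinstance(term, str):
        return term
    if term[0] in pairs:
        return (pairs[term[0]],)
    return (term[0],) + tuple(interpret(a, pairs) for a in term[1:])


def rename_head(term, renaming):
    return (renaming.get(term[0], term[0]),) + term[1:]


def make_precedence(pairs):
    """Strict precedence as the transitive closure of the given pairs."""
    gt = set(pairs)
    while True:
        extra = {(a, d) for (a, b) in gt for (c, d) in gt if b == c} - gt
        if not extra:
            return lambda f, g: (f, g) in gt
        gt |= extra


def occurs(x, term):
    return term == x if isinstance(term, str) else any(occurs(x, a) for a in term[1:])


def multiset_gt(ms, ns, prec):
    m_only, n_only = Counter(ms) - Counter(ns), Counter(ns) - Counter(ms)
    return bool(m_only) and all(any(rpo_gt(x, y, prec) for x in m_only) for y in n_only)


def rpo_gt(s, t, prec):
    """s >_rpo t with multiset status."""
    if isinstance(s, str):
        return False
    if isinstance(t, str):
        return occurs(t, s)
    if any(si == t or rpo_gt(si, t, prec) for si in s[1:]):   # subterm case
        return True
    if prec(s[0], t[0]):                                      # bigger head symbol
        return all(rpo_gt(s, tj, prec) for tj in t[1:])
    return s[0] == t[0] and multiset_gt(s[1:], t[1:], prec)   # equal heads


# Ingredients of Example 3.
a, b = ("a",), ("b",)
fb, ga, ha, hfb = ("f", b), ("g", a), ("h", a), ("h", ("f", b))
pairs = {"f": "b", "g": "b"}                 # the pairs (f(x), b) and (g(x), b)
renaming = {"f": "F"}                        # N(f) = F
prec = make_precedence([("h", "F"), ("F", "g"), ("F", "a"), ("a", "b")])


def gt_q(s, t):
    """s >_Q t, checked as I(N(s)) >_rpo I(N(t))."""
    return rpo_gt(interpret(rename_head(s, renaming), pairs),
                  interpret(rename_head(t, renaming), pairs), prec)
```

Each strict comparison with $\succ_Q$ claimed in the example corresponds to one call of `gt_q`, for instance `gt_q(ha, fb)` for $h(a) \succ_Q f(b)$.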

Example 4. In the following example of nested recursion (from [FZ95]) we use a
precedence $\succeq_{\mathcal{F}}$ as first component in $\succeq_Q$.

    f(g(x))  →  g(f(f(x)))
    f(h(x))  →  h(g(x))

We can take as $\succeq_Q$ the lexicographic combination $(\succeq_T, \succeq_I^N)_{lex}$ with the
precedence $f \succ_{\mathcal{F}} g$ and $f \succ_{\mathcal{F}} h$, and the renaming map $N$ which is the identity
except $N(f) = F$. For $\succeq_I$, the interpretation is given by $(f(x), x)$ and $(h(x), a)$;
and the basic quasi-ordering is RPO generated by the empty precedence. Note
that we have added to the signature the function symbol $F$ and the constant $a$.

We only show here that $f(g(x)) \succ_{mspo} g(f(f(x)))$. First, we have $f(g(x)) \succeq_I
g(f(f(x)))$, since $I(f(g(x))) = g(x) = I(g(f(f(x))))$. To prove $f(g(x)) \succ_{spo}
g(f(f(x)))$ we apply case 2, since $f \succ_{\mathcal{F}} g$ and hence $f(g(x)) \succ_Q g(f(f(x)))$.
For the recursive call $f(g(x)) \succ_{spo} f(f(x))$ we apply case 2 as well, since
$I(N(f(g(x)))) = F(g(x)) \succ_{rpo} F(x) = I(N(f(f(x))))$ and hence $f(g(x)) \succ_Q
f(f(x))$. For the recursive call $f(g(x)) \succ_{spo} f(x)$ we apply again case 2, since
$I(N(f(g(x)))) = F(g(x)) \succ_{rpo} F(x) = I(N(f(x)))$, and for the recursive call
$f(g(x)) \succ_{spo} x$ we apply twice case 1.



Example 5. In the following non-simply terminating example (from [AG97]) we
use a precedence as second component in $\succeq_Q$.

    le(0, y)           →  true
    le(s(x), 0)        →  false
    le(s(x), s(y))     →  le(x, y)
    minus(0, y)        →  0
    minus(s(x), y)     →  if(le(s(x), y), s(x), y)
    if(true, s(x), y)  →  0
    if(false, s(x), y) →  s(minus(x, y))
    quot(0, s(y))      →  0
    quot(s(x), s(y))   →  s(quot(minus(x, y), s(y)))

We can take as $\succeq_Q$ the lexicographic combination $(\succeq_I, \succeq_T)_{lex}$ with the
precedence $le \succ_{\mathcal{F}} true$, $le \succ_{\mathcal{F}} false$, $minus \succ_{\mathcal{F}} if$ and $if \succ_{\mathcal{F}} s$. For $\succeq_I$, the
interpretation is given by $(le(x, y), b)$, $(true, b)$, $(false, b)$, $(if(x, y, z), y)$ and
$(minus(x, y), x)$; and the basic quasi-ordering is RPO generated by the
precedence $quot \succ_P s$, $s \succ_P b$ and $s \succ_P 0$.

We only show here that $minus(s(x), y) \succ_{mspo} if(le(s(x), y), s(x), y)$. First
$minus(s(x), y) \succeq_I if(le(s(x), y), s(x), y)$ holds since $I(minus(s(x), y)) = s(x) =
I(if(le(s(x), y), s(x), y))$. To prove $minus(s(x), y) \succ_{spo} if(le(s(x), y), s(x), y)$,
we have $minus(s(x), y) \succ_Q if(le(s(x), y), s(x), y)$, since $minus \succ_{\mathcal{F}} if$. Then
by case 2 of SPO we only need to check the recursive calls $minus(s(x), y) \succ_{spo}
le(s(x), y)$, $minus(s(x), y) \succ_{spo} s(x)$ and $minus(s(x), y) \succ_{spo} y$. The last two
follow from case 1 of SPO. For the first one we have $minus(s(x), y) \succ_Q le(s(x), y)$,
since $I(minus(s(x), y)) = s(x) \succ_{rpo} b = I(le(s(x), y))$, and as before we can
conclude by case 2.



Example 6. The following system is an automatic translation of a Prolog
program that computes the Ackermann function. This example comes from Claus
Claves' Master's thesis, in the context of the development of TALP [OCM00], an
automated termination proof tool for logic programs based on rewriting
techniques (this example was posed to us by Claude Marché). This TRS can be
proved terminating by the dependency pair method (see section 5.2), but using a
further refinement of the dependency graph introduced in [AG98] (see also
[AG00]).

As shown below, this example can be easily handled by MSPO with a very
simple quasi-reduction pair.

    ack_in(0, n)          →  ack_out(s(n))
    ack_in(s(m), 0)       →  u11(ack_in(m, s(0)))
    u11(ack_out(n))       →  ack_out(n)
    ack_in(s(m), s(n))    →  u21(ack_in(s(m), n), m)
    u21(ack_out(n), m)    →  u22(ack_in(m, n))
    u22(ack_out(n))       →  ack_out(n)

We can take as $\succeq_Q$ the lexicographic combination with the
precedence ack_in … ack_out, ack_in $\succ_{\mathcal{F}}$ u11, ack_in $\succ_{\mathcal{F}}$ s, ack_in $\succ_{\mathcal{F}}$ s,
ack_in … u21 and u21 $\succ_{\mathcal{F}}$ u22; and with the renaming map $N$ which is the
identity except $N$(ack_in) = Ack_in and $N$(u21) = U21. For $\succeq_I$, the
interpretation is given by (ack_out(x), a), (ack_in(x, y), a), (Ack_in(x, y), acku(x)),
(u21(x, y), a), (U21(x, y), acku(y)), (…, a), and (u22(x), a); and the basic
quasi-ordering is RPO generated by the precedence $s \succ_P a$ and $0 \succ_P a$.

5 Generalizing Other Termination Proof Methods 

As an application of the provided methods to generate suitable quasi-orderings
$\succeq_Q$, we will show how two known termination proof methods, namely dummy
elimination [FZ95] and the dependency pairs [AG97], can be seen as a particular
instance of the monotonic semantic path ordering. Note that as a side effect,
this provides a new simple proof of their correctness. Finally we study Geser's
proposal and show that it is strictly weaker than ours.



5.1 Dummy Elimination 

Dummy elimination consists of a transformation which eliminates function
symbols from a signature, replacing them by a constant ($\diamond$ in our notation); terms
and rewrite rules are transformed accordingly. The soundness result states that
a TRS $R$, defined over $T(\mathcal{F} \cup \mathcal{F}_a, \mathcal{X})$, where $\mathcal{F}_a$ contains symbols of arity $\geq 1$
which are to be eliminated, is terminating if the transformed TRS $E(R)$, defined
over $T(\mathcal{F} \cup \{\diamond\}, \mathcal{X})$, is terminating.

Let $I_a$ be the interpretation defined by the pairs $(g(x_1, \ldots, x_n), \diamond)$ for every
symbol $g$ to be eliminated (and the identity pair for all other symbols). The
system $E(R)$ is given by

$E(R) = \{ cap(l) \to u \mid u \in \{cap(r)\} \cup dec(r),\ l \to r \in R \}$

where $cap(s) = I_a(s)$ and $dec(s)$ contains $I_a(s')$ for all $s'$ subterm of $s$ just
below a function symbol to be eliminated (for details see [FZ95]). For example,
$R = \{f(f(x)) \to f(g(f(x)))\}$ is transformed, via the elimination of $g$, into the
system $E(R) = \{f(f(x)) \to f(\diamond);\ f(f(x)) \to f(x)\}$, where $\diamond$ is the constant
replacing $g$.
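
The transformation $E(R)$ can be sketched directly from the definitions of cap and dec given above. The tuple encoding of terms and the names `cap`, `dec` and `eliminate` as Python functions are assumptions made for illustration; for the precise definition of the transformation see [FZ95].

```python
# A variable is a str; an application f(t1, ..., tn) is a tuple ("f", t1, ..., tn).
DIAMOND = ("◇",)                             # the constant replacing eliminated symbols


def cap(term, eliminated):
    """Replace every maximal subterm headed by an eliminated symbol by the constant."""
    if isinstance(term, str):
        return term
    if term[0] in eliminated:
        return DIAMOND
    return (term[0],) + tuple(cap(a, eliminated) for a in term[1:])


def dec(term, eliminated):
    """cap of every subterm sitting just below an eliminated symbol."""
    if isinstance(term, str):
        return set()
    result = set()
    for arg in term[1:]:
        if term[0] in eliminated:
            result.add(cap(arg, eliminated))
        result |= dec(arg, eliminated)
    return result


def eliminate(rules, eliminated):
    """E(R): one rule cap(l) -> u for each u in {cap(r)} united with dec(r)."""
    return {(cap(l, eliminated), u)
            for l, r in rules
            for u in {cap(r, eliminated)} | dec(r, eliminated)}


# The example from the text: R = { f(f(x)) -> f(g(f(x))) }, eliminating g.
rules = [(("f", ("f", "x")), ("f", ("g", ("f", "x"))))]
transformed = eliminate(rules, {"g"})
```

On this input `transformed` contains exactly the two rules $f(f(x)) \to f(\diamond)$ and $f(f(x)) \to f(x)$ of the example.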
In the following we show that whenever termination of $R$ can be shown using
dummy elimination, we can find a simple quasi-reduction pair $(\succeq_I, \succeq_Q)$ s.t. $R$
is contained in MSPO.

Let $\succ$ be the reduction ordering containing $E(R)$, and $\succeq$ be $\succ \cup =$. Then
we define $\succeq_I$ by means of the interpretation $I_a$ defined above and $\succeq$ as basic
quasi-ordering, i.e. $s \succeq_I t$ iff $I_a(s) \succeq I_a(t)$. On the other hand, we define $\succeq_Q$
as $s \succeq_Q t$ iff $I_a(s)\ (\succ \cup \rhd)^*\ I_a(t)$. Note that if $\succ$ is a simplification ordering like
RPO, the subterm relation is already included.

Now we show that $(\succeq_I, \succeq_Q)$ is a quasi-reduction pair. Since $\succ$ is a reduction
ordering, $\succeq$ is a quasi-reduction quasi-ordering, and, by proposition 2, $\succeq_I$ also
is. As already said, since $\succ$ is a reduction ordering, we have that $(\succ \cup \rhd)^*$ is a
well-founded quasi-ordering stable under substitutions, and since $\succeq_I \subseteq \succeq_Q$, by
propositions 3.1 and 4.1, we conclude that $(\succeq_I, \succeq_Q)$ is a quasi-reduction pair.

Finally we show that if $E(R)$ is contained in $\succ$ then $R$ is contained in $\succ_{mspo}$.
Since $cap(l) \to cap(r) \in E(R)$ we have $I_a(l) = cap(l) \succ cap(r) = I_a(r)$
and hence $l \succeq_I r$. To show that $l \succ_{spo} r$ we prove that $l \succ_{spo} r'$ for all $r'$
subterm of $r$, by induction on $|r'|$. We have that $u \unrhd cap(r')$ for some $u \in
\{cap(r)\} \cup dec(r)$. By definition, $cap(l) \to u \in E(R)$, and hence $cap(l) \succ u$.
Consequently, since the strict part of $(\succ \cup \rhd)^*$ includes $(\succ \cup \rhd)^+$, we have
$I_a(l) = cap(l)\ (\succ \cup \rhd)^+\ cap(r') = I_a(r')$ and thus $l \succ_Q r'$. By induction
hypothesis $l \succ_{spo} r'_j$ for all arguments $r'_j$ of $r'$, and therefore, by case 2, we have
$l \succ_{spo} r'$.

Recently, the argument filtering transformation method [KNT99] has been
proposed. The basic idea of this method is the same as in dummy elimination but
using more general term transformations, like the AFS in the dependency pair
method, or the ones presented here. In fact, they coincide with the interpretations
we have given but requiring that $f$ and $f_I$ are the same symbol. Therefore, we
can prove, in the same way as for dummy elimination, that this method is a
particular instance of MSPO, since we can take $\succeq_I$ and $\succeq_Q$ as before, but with
these new interpretations.

Finally, let us remark that, by using interpretations from terms to multisets
of terms, we can also show that the distribution elimination method [Zan94] is a
particular case of MSPO. In our case the restrictions imposed on this method to
be correct are necessary to ensure that $\succeq_I$ and $\succeq_Q$ are stable under substitutions.

5.2 Dependency Pairs 

We consider here the plain dependency pair method, i.e. the method without
using what is called the dependency graph refinement (see [AG97,AG00] for
details). In section 6 some ideas about how this refinement can be incorporated
into our method will be given.

In this method, for a given TRS $R$ the signature $\mathcal{F}$ is split into two sets: the
constructor symbols set $C$ and the defined symbols set $D$. Defined symbols are
those heading the left hand side of a rule in $R$ and constructor symbols are all
others.

Let $N$ be a renaming map in $\mathcal{F}$ defined as $N(f) = F$ for all symbols $f \in D$
and the identity for the others. The dependency pairs of a rule $l \to r$ in $R$
are the pairs $(N(l), N(r'))$ for every subterm $r'$ of $r$ headed by a symbol
in $D$. For example, the rule $f(f(x)) \to f(g(f(x)))$ has the dependency pairs
$(F(f(x)), F(g(f(x))))$ and $(F(f(x)), F(x))$.
Then $R$ is terminating if there is a quasi-reduction quasi-ordering $\succeq$ such
that for every rule $l \to r$ we have $l \succeq r$ and for every dependency pair $(s, t)$
of $l \to r$ we have $s \succ t$. Note that in practice these conditions are expressed
as an ordering constraint, called in the rest of the paper the dependency pair
constraint.
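
The construction of dependency pairs described above can be sketched as follows. The tuple encoding of terms is an assumption, and so is the convention of modelling the renamed symbol $N(f) = F$ by upper-casing the symbol name.

```python
# A variable is a str; an application f(t1, ..., tn) is a tuple ("f", t1, ..., tn).
# A rule is a pair (lhs, rhs).

def subterms(term):
    """Yield the term itself and all of its subterms."""
    yield term
    if not isinstance(term, str):
        for arg in term[1:]:
            yield from subterms(arg)


def defined_symbols(rules):
    """Symbols heading the left hand side of some rule."""
    return {lhs[0] for lhs, _ in rules}


def rename_head(term, defined):
    """N: rename the head symbol if it is defined (modelled here by upper-casing)."""
    return (term[0].upper(),) + term[1:] if term[0] in defined else term


def dependency_pairs(rules):
    """All pairs (N(l), N(r')) with r' a subterm of r headed by a defined symbol."""
    defined = defined_symbols(rules)
    pairs = []
    for lhs, rhs in rules:
        for sub in subterms(rhs):
            if not isinstance(sub, str) and sub[0] in defined:
                pairs.append((rename_head(lhs, defined), rename_head(sub, defined)))
    return pairs


# The rule f(f(x)) -> f(g(f(x))): f is defined, g is a constructor.
rules = [(("f", ("f", "x")), ("f", ("g", ("f", "x"))))]
pairs = dependency_pairs(rules)
```

For this rule the only subterms of the right hand side headed by the defined symbol $f$ are $f(g(f(x)))$ and $f(x)$, so two pairs are produced.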

In the following we show that whenever termination of $R$ can be shown using
dependency pairs, we can find a simple quasi-reduction pair $(\succeq_I, \succeq_Q)$, obtained
from the ordering used by the dependency pair method, s.t. $R$ is contained in MSPO.

Let $\succeq_{\mathcal{F}}$ be a precedence on $\mathcal{F}$ where $f \succeq_{\mathcal{F}} g$ iff $f \in D$ (hence $f \succ_{\mathcal{F}} g$ iff $f \in D$
and $g \in C$). Then we take $\succeq_I$ as $\succeq$ and $\succeq_Q$ as $(\succeq_T, \succeq_I^N)_{lex}$. By propositions 3
and 4 (2 and 3), we have that $(\succeq_I, \succeq_Q)$ is a quasi-reduction pair.

Finally we show that $R$ is contained in $\succ_{mspo}$. We have directly that $l \succeq_I r$
since $l \succeq r$ in the dependency pairs proof. To show that $l \succ_{spo} r$ we prove
that $l \succ_{spo} r'$ for all $r'$ subterm of $r$, by induction on $|r'|$. If $top(r') \in C$ then
$top(l) \succ_{\mathcal{F}} top(r')$ and hence $l \succ_Q r'$. Otherwise $top(l) \simeq_{\mathcal{F}} top(r')$ and there is
a dependency pair $(N(l), N(r'))$, s.t. $N(l) \succ N(r')$, which implies $l \succ_I^N r'$, and
hence $l \succ_Q r'$. Since, by induction hypothesis, $l \succ_{spo} r'_j$ for all arguments $r'_j$ of
$r'$, by case 2 of SPO, we have $l \succ_{spo} r'$.

Note that we have only used case 2 of SPO for proving that $R$ is contained
in $\succ_{mspo}$. Moreover, note that the precedence we have used to build $\succeq_Q$ is quite
weak in its strict part. We believe that by using better precedences as a first
component of $\succeq_Q$, e.g. adding strict comparisons between the defined symbols,
we can easily capture part of the power of the dependency graph (see section 6).



5.3 Geser’s Monotonic Semantic Path Ordering 

Now we analyze Geser's proposal [Ges92] for a monotonic SPO. We give here
the strict part of his definition.

Definition 6. Let $\succeq_Q$ be a quasi-reduction quasi-ordering.

$s \succ_G t$ iff $s \succ_{spo} t$ and $f(\ldots, s, \ldots) \succeq_Q f(\ldots, t, \ldots)$ for all $f \in \mathcal{F}$

Although this version is an important step in the right direction, it has two
main weaknesses. First of all, the requirement of $\succeq_Q$ to be a quasi-reduction
quasi-ordering is too strong. As shown in example 2, this makes SPO lose
completeness, and hence Geser's proposal cannot be complete. Furthermore, in
a more practical view, with such a $\succeq_Q$ neither the dummy elimination technique
nor the dependency pair technique can be included (note that, for instance, the
renaming mapping does not preserve quasi-monotonicity). On the other hand,
with respect to efficiency of an implementation of the method, a termination
proof requires a huge number of comparisons with $\succeq_Q$, since for every rule $l \to r$,
every symbol $f$ and every argument position of this symbol,
$f(\ldots, l, \ldots) \succeq_Q f(\ldots, r, \ldots)$ has to be checked. Note that instead we only have
to check $l \succeq_I r$.

Now we show that $\succ_G$ is included in $\succ_{mspo}$.

Lemma 2. Let $\succeq_Q$ be a quasi-reduction quasi-ordering. There exists some $\succeq_I$
s.t. $(\succeq_I, \succeq_Q)$ is a quasi-reduction pair and $\succ_G \subseteq \succ_{mspo}$.

Proof. We take $\succeq_I$ as $s \succeq_I t$ iff $f(\ldots, s, \ldots) \succeq_Q f(\ldots, t, \ldots)$ for all $f \in \mathcal{F}$. It is
obvious that in this case $\succ_G \subseteq \succ_{mspo}$. Then, we have to prove that $(\succeq_I, \succeq_Q)$
is a quasi-reduction pair, that is, $\succeq_I$ is quasi-monotonic on $\succeq_Q$, which follows
by definition of $\succeq_I$, and $\succeq_I$ is a quasi-monotonic quasi-ordering stable under
substitutions, which follows directly from the properties of $\succeq_Q$. □

6 Constraints 

Using MSPO, we can translate our termination problem into an ordering
constraint solving problem (similar to the ones given in [Com90] for lexicographic
path ordering, except that here variables are universally quantified), which is,
in general, more suitable for automation. This translation is simply based on
applying the definition of MSPO, and SPO, to the rules of the TRS.

Let $R$ be a set of rules $\{l_i \to r_i \mid 1 \leq i \leq n\}$. We consider the following initial
MSPO-constraint:

$l_1 \succ_{mspo} r_1 \land \ldots \land l_n \succ_{mspo} r_n$

which is transformed by applying the definition of MSPO into the conjunction
of two constraints $I_C$ and $SPO_C$:

$I_C$: $l_1 \succeq_I r_1 \land \ldots \land l_n \succeq_I r_n$
$SPO_C$: $l_1 \succ_{spo} r_1 \land \ldots \land l_n \succ_{spo} r_n$

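Now the definition of SPO is applied to the second part of the constraint. This
is formalized by means of constraint transformation rules:

$s \succeq_{spo} t \;\Longrightarrow\; \top$   if $s \equiv t$

$s \succeq_{spo} t \;\Longrightarrow\; s \succ_{spo} t$   if $s \not\equiv t$

$x \succ_{spo} t \;\Longrightarrow\; \bot$

$s \succ_{spo} x \;\Longrightarrow\; \top$   if $s \not\equiv x \in Vars(s)$

$s = f(s_1, \ldots, s_m) \succ_{spo} g(t_1, \ldots, t_n) = t \;\Longrightarrow$
    $s_1 \succeq_{spo} t \lor \ldots \lor s_m \succeq_{spo} t \;\lor$
    $(s \succ_Q t \land s \succ_{spo} t_1 \land \ldots \land s \succ_{spo} t_n) \;\lor$
    $(s \succeq_Q t \land \{s_1, \ldots, s_m\} \succ\succ_{spo} \{t_1, \ldots, t_n\})$

where $\{s_1, \ldots, s_m\} \succ\succ_{spo} \{t_1, \ldots, t_n\}$ is translated into a constraint over $\succ_{spo}$
and $\succeq_{spo}$.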
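
The three cases of SPO behind these transformation rules can be illustrated by a direct recursive check in which $\succ_Q$ and $\succeq_Q$ are supplied as oracles. This is a sketch under stated assumptions, not the constraint generator itself: it decides the relation for given oracles instead of producing a constraint, it returns false for $s \succ_{spo} x$ when $x$ does not occur in $s$, and the toy oracles (a precedence on head symbols given by numeric ranks) are hypothetical.

```python
from collections import Counter

# A variable is a str; an application f(t1, ..., tn) is a tuple ("f", t1, ..., tn).

def occurs(x, term):
    return term == x if isinstance(term, str) else any(occurs(x, a) for a in term[1:])


def multiset_gt(ms, ns, gt_q, ge_q):
    """Multiset extension of >_spo after cancelling common elements."""
    m_only, n_only = Counter(ms) - Counter(ns), Counter(ns) - Counter(ms)
    return bool(m_only) and all(
        any(spo_gt(x, y, gt_q, ge_q) for x in m_only) for y in n_only)


def spo_gt(s, t, gt_q, ge_q):
    """s >_spo t for the quasi-ordering given by the oracles gt_q and ge_q."""
    if isinstance(s, str):                   # a variable is never greater
        return False
    if isinstance(t, str):                   # s > x when x occurs in s
        return occurs(t, s)
    ss, ts = s[1:], t[1:]
    if any(si == t or spo_gt(si, t, gt_q, ge_q) for si in ss):       # case 1
        return True
    if gt_q(s, t) and all(spo_gt(s, tj, gt_q, ge_q) for tj in ts):   # case 2
        return True
    return ge_q(s, t) and multiset_gt(ss, ts, gt_q, ge_q)            # case 3


# Toy oracles: compare head symbols in the precedence f > g.
rank = {"f": 1, "g": 0}
gt_q = lambda s, t: rank[s[0]] > rank[t[0]]
ge_q = lambda s, t: rank[s[0]] >= rank[t[0]]
```

With these toy oracles `spo_gt(("f", "x"), ("g", "x"), gt_q, ge_q)` succeeds through case 2, while comparing two terms with the same head falls through to the multiset comparison of case 3.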

It is easy to see that these transformation rules are terminating and confluent.
Moreover, the resulting normal form is an ordering constraint over $\succ_Q$ and $\succeq_Q$.
Then after computing the disjunctive normal form, the initial constraint $SPO_C$
has been translated into a disjunction of constraints over $\succ_Q$ and $\succeq_Q$, each one
of the form

$Q_C$: $s_1 \succ_Q t_1 \land \ldots \land s_p \succ_Q t_p \land s'_1 \succeq_Q t'_1 \land \ldots \land s'_q \succeq_Q t'_q$

where none of the terms are variables.

Now we have to find a quasi-reduction pair $(\succeq_I, \succeq_Q)$ satisfying $I_C$ and one of
these constraints $Q_C$. Note that, from what we have seen in section 5.2, the
dependency pair constraint can be obtained by applying always case 2 of SPO,
which means that some $Q_C$ obtained above represents this path. Since there are
several possible $Q_C$, it may happen, as in example 6, that the path chosen by
the dependency pair method is not the easiest one.

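To solve the constraints $I_C$ and $Q_C$ some simplification techniques are
necessary. A first simple example of such a simplification is obtained by considering
that $\succeq_Q$ is a lexicographic combination $(\succeq_T, \succeq_M)_{lex}$ with $(\succeq_I, \succeq_M)$ being a
quasi-reduction pair. Then we take the constraint $Q_C$, and define $top(s) \succeq_{\mathcal{F}} top(t)$ iff
either $s \succeq_Q t$ or $s \succ_Q t$ is in $Q_C$. Now, if $\succ_{\mathcal{F}}$ is the strict part of $\succeq_{\mathcal{F}}$ and $\simeq_{\mathcal{F}}$ its
equivalence, we can simplify the constraint $Q_C$ into $M_C$ by the following rules:

$s \succ_Q t \;\Longrightarrow\; \top$   if $top(s) \succ_{\mathcal{F}} top(t)$

$s \succeq_Q t \;\Longrightarrow\; \top$   if $top(s) \succ_{\mathcal{F}} top(t)$

$s \succ_Q t \;\Longrightarrow\; s \succ_M t$   if $top(s) \simeq_{\mathcal{F}} top(t)$

$s \succeq_Q t \;\Longrightarrow\; s \succeq_M t$   if $top(s) \simeq_{\mathcal{F}} top(t)$

At this point, in general, it is interesting to define $\succeq_M$ by means of a renaming
$N$ of all symbols heading $s$ or $t$ in some $s \succ_M t$ or $s \succeq_M t$ in $M_C$, which leads
to a final renamed constraint called $N_C$.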

Solving the constraints $I_C$ and $N_C$, i.e., finding an adequate quasi-reduction
pair, is very similar to solving the dependency pair constraints. The simplest
solution is to consider the quasi-reduction pair $(\succeq_I, \succeq_I)$, but we believe that, if
necessary, we will be able to use refinements like the dependency graph in our
constraint solver.

7 Modularity 

In this section we present some modularity results for MSPO which are obtained 
applying known abstract sufficient conditions ensuring modularity of termina- 
tion. We consider disjoint unions of TRSs, i.e., systems that do not share any 
symbol, and constructor sharing unions of TRSs, i.e., systems which share only 
constructors. 

A TRS $R$ is called terminating under non-deterministic collapses, denoted
$C_{\mathcal{E}}$-terminating, if $R \cup \{G(x, y) \to x,\ G(x, y) \to y\}$ terminates for some new
symbol $G$. For $C_{\mathcal{E}}$-termination we have the following results:

- [Ohl94] $C_{\mathcal{E}}$-termination is a modular property for disjoint unions of TRSs.

- [Gra94] $C_{\mathcal{E}}$-termination is a modular property for constructor-sharing unions
  of finite TRSs.
Lemma 3. If $R$ is included in $\succ_{mspo}$ then $R$ is $C_{\mathcal{E}}$-terminating.

Note that since $\succeq_B$ includes the subterm relation and $G$ is a new symbol,
we have $I(G(x, y)) = G(x, y)$ and hence $G(x, y) \succeq_I x$ and $G(x, y) \succeq_I y$.
Therefore, since by case 1 both rules are included in SPO, we can conclude that
$G(x, y) \succ_{mspo} x$ and $G(x, y) \succ_{mspo} y$.

Corollary 1. Let $\succ^1_{mspo}$ and $\succ^2_{mspo}$ be two MSPO whose basic orderings include
the subterm relation and let $R_1$ and $R_2$ be TRSs that are included in $\succ^1_{mspo}$
and $\succ^2_{mspo}$ respectively.

- If $R_1$ and $R_2$ are disjoint then $R_1 \cup R_2$ is $C_{\mathcal{E}}$-terminating, and thus
  terminating.

- If $R_1$ and $R_2$ share only constructors and are finite then $R_1 \cup R_2$ is
  $C_{\mathcal{E}}$-terminating, and thus terminating.

Similarly, in [GO98] modularity results for disjoint and constructor sharing
unions of TRSs proved terminating using the dependency pair method are
presented. For disjoint systems the results are the same, but for constructor sharing
unions the restriction is not imposed on the finiteness of the systems but on the
treatment of the shared symbols in the termination proof.

8 Conclusion 

In this paper we have described a new ordering-based general method for proving
termination of TRSs. MSPO is based on the well-known SPO, but unlike SPO
it is monotonic, which makes it useful in practice. It is a complete method, i.e.,
it characterizes termination. The method generalizes, in a simple way, many
known methods based on transformations. In the case of the dependency pairs
method, which is by now one of the most successful general methods applied
in practice, we have only shown that we generalize it without the “dependency
graph” refinement. These kinds of “operational” refinements do not fit so well
in our framework, but, by considering the termination of TRSs as an ordering
constraint solving problem over MSPO, we believe that, if necessary, we will be
able to use these refinements in our constraint solver. On the other hand, due to
its additional flexibility, we have seen that MSPO provides, in some cases, simpler
termination proofs than the dependency pair method. Thus, in order to study
the behavior of MSPO, we are developing a termination system based on this
method. Currently this system can only check termination once the ingredients
for the MSPO are provided, but our aim is to fully automate the termination
proofs, by using some heuristics for selecting ingredients to be tried.

Besides its application to first-order term rewriting, the fact that the method
is defined by means of orderings, for which the properties to be fulfilled are
well-known, opens the door to other important classes like AC-rewriting (i.e.,
rewriting modulo associativity and commutativity axioms) or higher-order rewriting,
for which the lack of general methods is more important. The idea is to combine
with MSPO some recent results obtained for RPO in the AC-case [Rub99] and
in the HO-case [JR99], since all of them share the same structure and are based on
orderings.

Finally, apart from the presented modularity results, we are studying other
kinds of combinations. In particular, we are interested in the so-called hierarchical
combinations (see [AG98] for related results for the dependency pair method), since
reusing termination proofs or proving the termination of the TRS by splitting it
into different parts may be crucial in practice.
Abstract. We introduce a calculus of stratified resolution, in which
special attention is paid to clauses that “define” relations. If such clauses
are discovered in the initial set of clauses, they are treated using the
rule of definition unfolding, i.e. the rule that replaces defined relations
by their definitions. Stratified resolution comes with a new, previously
not studied, notion of redundancy: a clause to which definition unfolding
has been applied can be removed from the search space. To prove
completeness of stratified resolution with redundancies we use a novel
technique of traces.



1 Introduction 

In this article we introduce two versions of stratified resolution, a resolution
calculus with special rules for handling hierarchical definitions of relations.
Stratified resolution generalizes SLD-resolution for Horn clauses to a more
general case, where clauses may be non-Horn but “Horn with respect to a set of
relations”.

Example 1 Suppose we try to establish inconsistency of a set of clauses S
containing a recursive definition of a relation split that splits a list of conferences
into two sublists: of deduction-related conferences, and of all other conferences.

    split([x|y], [x|z], u) :- deduction(x), split(y, z, u).
    split([x|y], z, [x|u]) :- ¬deduction(x), split(y, z, u).
    split([], [], []).

Suppose that S also contains other clauses, for example

    ¬split(x, y, z) ∨ conference_list(x).

If we use ordered resolution with negative selection (as most state-of-the-art
systems would do), we face several choices in selecting the order and negative
literals in clauses. For example, if we choose the order in which every literal with
the relation deduction is greater than any literal with the relation split, then we
must select either ¬deduction(x) or ¬split(y, z, u) in the first clause. It seems
much more natural to select split([x|y], [x|z], u) instead; then we can use the
first clause in the same way it would be used in logic programming. Likewise,
if we always try to select a negative literal in a clause, the literal ¬split(y, z, u)
will be selected in the second clause, which is most likely a wrong choice, since
then any resolvent with the second clause will give us a larger clause.

Let us now choose an ordering in which the literals split([x|y], [x|z], u)
and split([x|y], z, [x|u]) are maximal in their clauses, and select these
literals. Consider the fourth clause. If we select ¬split(x, y, z) in it, we can resolve
this literal with all three clauses defining split. It would be desirable to select
conference_list(x) in it (if our ordering allows us to do so), since a resolvent
upon conference_list(x) is likely to instantiate x to a nonvariable term t, and
then the literal ¬split(t, y, z) can be resolved with only two, one or no clauses at
all, depending on the form of t.

In all cases, it seems reasonable to choose an ordering and selection function
in such a way that the first three clauses will be used as a definition of split so that
we unfold this definition, i.e. replace the heads of these clauses with their bodies.
Such an ordering would give us the best results if we have the right strategy of
negative selection, which says: select ¬split(t, r, s) only if t is instantiated enough,
or if we have no other choice.

In order to implement this idea we have to be able to formalize the right 
notion of a “definition” in a set of clauses. Such a formalization is undertaken in 
our paper, in the form of a calculus of stratified resolution. Stratified resolution is 
based on the following ideas that can be tracked down to earlier ideas developed 
in logic programming. 

1 . Logic programming is based on the idea of using definite clauses as definitions 
of relations. Similar to the notion of definite clause, we introduce a more 
general notion of a set of clauses definite w.r.t. a set of relations. These 
relations are regarded as defined by this set of clauses. 

2. In logic programming, relations are often defined in terms of other relations. 
The notion of stratification [5,1,8] allows one to formalize the notion “P is 
defined in terms of Q”. We use a similar idea of stratification, but in our 
case stratification must be related to a reduction ordering on literals. 

Consider another example. 

Example 2 The difficult problem is to find automatically the right ordering
that makes the atom in the head of a “definition” greater than atoms in the
body of this definition. Consider, for example, clauses defining reachability in a
directed graph, where the graph is formalized by the binary relation edge:

    reachable(x, y) :- edge(x, y).
    reachable(x, z) :- edge(x, y), reachable(y, z).

There is no well-founded ordering stable under substitutions that makes the atom
reachable(x, z) greater than reachable(y, z). So the standard ordered resolution
with negative selection cannot help us in selecting the “right” ordering. The
theory developed in this paper allows one to select only the literal reachable(x, z)
in this clause despite the fact that this literal is not the greatest.
In addition to intelligent selection of literals in clauses, stratified resolution
has a new kind of redundancy, not exploited so far in automated deduction. To
explain this kind of redundancy, let us come back to Example 1. Suppose we
have a clause

    ¬split([cade, www, lpar], y, z).                                    (1)

Stratified resolution can resolve this clause with the first two clauses in the
definition of split, obtaining two new clauses

    deduction(cade), split([www, lpar], y, z);
    ¬deduction(cade), split([www, lpar], y, z).

In resolution-based theorem proving these two clauses would be added to the
search space. We prove that they can replace clause (1), thus making the search
space smaller.

When the initial set of clauses contains no definitions or cannot be strati- 
fied, stratified resolution becomes the ordinary ordered resolution with negative 
selection. However, sets of clauses which contain definitions and can be strati- 
fied in our sense are often met in practice, since they correspond to definitions 
(maybe recursive) of relations of a special form. For example, a majority of 
TPTP problems can be stratified. 

This paper is organized as follows. In Section 2 we define the ground version of 
stratified resolution and prove its soundness and completeness. Then in Section 3 
we define stratified resolution with redundancies, a calculus in which a clause can 
be removed from the search space after a definition unfolding has been applied 
to it. Then in Section 4 we define a nonground version of stratified resolution 
with redundancies. 

In this paper we deal with first-order logic without equality; we will only
briefly discuss equality in Section 6.



Related Work 

There are not so many papers in the automated deduction literature relevant 
to our paper. Our formal system resembles SLD-resolution [6]. When the initial 
set of clauses is Horn, our stratified resolution with redundancies becomes SLD- 
resolution. The possibility for arbitrary selection for Horn clauses, and even in 
the case of equational logic was proved in [7]. For one of our proofs we used a 
renaming technique introduced in [2,3]. 

2 Stratified Resolution: The Ground Case 

As usual, we begin with a propositional case, lifting to the general case will be 
standard. 

Throughout this paper, we denote by $\mathcal{L}$ a set of propositional atoms. The
literal complementary to a literal $L$ is denoted by $\overline{L}$. For every set of atoms $\mathcal{P}$,
we call a $\mathcal{P}$-atom any atom in $\mathcal{P}$, and a $\mathcal{P}$-literal any literal whose atom is in $\mathcal{P}$.
A clause is a finite multiset of literals. We denote literals by $L, M$ and clauses
by $C, D$, maybe with indices. The empty clause is denoted by $\square$. We use the
standard notation for multisets, for example write $C_1, C_2$ for the multiset union
of two clauses $C_1$ and $C_2$, and write $L \in C$ if the literal $L$ is a member of the
clause $C$. For two ground clauses $C_1$ and $C_2$, we say that $C_1$ subsumes $C_2$ if $C_1$
is a submultiset of $C_2$. This means, for example, that $A, A$ does not subsume $A$.
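
The submultiset reading of subsumption can be checked mechanically. The following
is a minimal Python sketch, assuming ground literals are represented as plain
strings and clauses as lists (a representation chosen here for illustration, not
taken from the paper); `subsumes` is a hypothetical helper name.

```python
from collections import Counter


def subsumes(c1, c2):
    """Return True if ground clause c1 is a submultiset of ground clause c2."""
    m1, m2 = Counter(c1), Counter(c2)
    # Every literal must occur in c2 at least as often as it occurs in c1.
    return all(m2[lit] >= count for lit, count in m1.items())


# Clauses are lists of literals; a repeated literal is a repeated list entry.
print(subsumes(["A"], ["A", "A"]))   # does A subsume A, A?
print(subsumes(["A", "A"], ["A"]))   # does A, A subsume A?
```

Counting occurrences rather than testing set inclusion is what distinguishes the
multiset notion used here from the more common set-based one.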

Let $\succ$ be a total well-founded ordering on $\mathcal{L}$. We extend this ordering to
literals in $\mathcal{L}$ in the standard way, such that for every atom $A$ we have $\neg A \succ A$
and there is no literal $L$ such that $\neg A \succ L \succ A$. If $L$ is a literal and $C$ is a
clause, we write $L \succ C$ if for every literal $M \in C$ we have $L \succ M$. Usually, we
will write a clause as a disjunction of its literals. As usual, we write $L \succeq L'$ if
$L \succ L'$ or $L = L'$, and write $L \succeq C$ if for every literal $M \in C$ we have $L \succeq M$.
In this paper we assume that $\succ$ always denotes a fixed well-founded ordering.

We will now define a notion of selection function which deviates slightly
from the standard notions (see e.g. [3]). However, the resulting inference
system will be standard, and several following definitions will become simpler.
We call a selection function any function $\sigma$ on the set of clauses such that (i)
$\sigma(C)$ is a set of literals in $C$, (ii) $\sigma(C)$ is nonempty whenever $C$ is nonempty,
and (iii) if $A \in \sigma(C)$ and $A$ is a positive literal, then $A \succeq C$. If $L \in \sigma(C)$, we
say that $L$ is selected by $\sigma$ in $C$. When we use a selection function, we underline
selected literals, so when we write $\underline{A} \lor C$, this means that $A$ (and maybe some
other literals) are selected in $A \lor C$.

In our proofs we will use the result on completeness of the inference system of 
ordered binary resolution with negative selection (but our main inference system 
will be different). Let us define this inference system. 

Definition 3 Let $\sigma$ be a selection function. The inference system $\mathcal{R}_\sigma^\succ$ consists
of the following inference rules:

1. Positive ordered factoring:

$$\frac{C \lor \underline{A} \lor A}{C \lor A},$$

where $A$ is a positive literal.

2. Binary ordered resolution with selection:

$$\frac{C \lor \underline{A} \qquad D \lor \underline{\neg A}}{C \lor D},$$

where $A \succ C$.

The following theorem is well-known (for example, a stronger statement can
be found in [3]).

Theorem 4 Let $S$ be a set of clauses. Then $S$ is unsatisfiable if and only if $\square$ is
derivable from $S$ in $\mathcal{R}_\sigma^\succ$.

The following two definitions are central to our paper.

Definition 5 Let $\mathcal{P} \subseteq \mathcal{L}$ be a set of atoms. A set $S$ of clauses is Horn with
respect to $\mathcal{P}$ if every clause of $S$ contains at most one positive $\mathcal{P}$-literal. A clause
$D$ is called definite with respect to $\mathcal{P}$ if $D$ contains exactly one positive $\mathcal{P}$-literal.
Let $P \lor L_1 \lor \ldots \lor L_n$ be a clause definite w.r.t. $\mathcal{P}$ such that $P \in \mathcal{P}$. We will
sometimes denote this clause by $P \mathrel{:-} \overline{L_1}, \ldots, \overline{L_n}$.



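Definition 6 (Stratification) We call a $\succ$-stratification of $\mathcal{L}$ any finite sequence
$\mathcal{L}_n \succ \ldots \succ \mathcal{L}_0$ of subsets of $\mathcal{L}$ such that

1. $\mathcal{L} = \mathcal{L}_0 \cup \ldots \cup \mathcal{L}_n$;

2. If $m > n$, $A \in \mathcal{L}_m$, and $B \in \mathcal{L}_n$, then $A \succ B$.

We will denote $\succ$-stratifications by $\mathcal{L}_n \succ \ldots \succ \mathcal{L}_0$ and call them simply
stratifications.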

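From now on we assume a fixed stratification of $\mathcal{L}$ of the form

$$\mathcal{Q}_n \succ \mathcal{P}_n \succ \mathcal{Q}_{n-1} \succ \mathcal{P}_{n-1} \succ \cdots \succ \mathcal{Q}_1 \succ \mathcal{P}_1 \succ \mathcal{Q}_0. \qquad (2)$$

We denote $\mathcal{P} = \mathcal{P}_n \cup \ldots \cup \mathcal{P}_1$ and $\mathcal{Q} = \mathcal{Q}_n \cup \ldots \cup \mathcal{Q}_0$ and use this notation
throughout the paper. Atoms in $\mathcal{P}$ will be denoted by $P$ (maybe with indices).

Let $C$ be a clause definite w.r.t. $\mathcal{P}$. Then $C$ contains a positive literal $P \in \mathcal{P}_i$,
for some $i$. We say that $C$ admits stratification (2) if all atoms occurring in $C$
belong to $\mathcal{P}_i \cup \mathcal{Q}_{i-1} \cup \ldots \cup \mathcal{P}_1 \cup \mathcal{Q}_0$. Note that every such clause has the form

$$P \mathrel{:-} \underbrace{P_1, \ldots, P_k}_{\text{atoms in } \mathcal{P}_i \cup \ldots \cup \mathcal{P}_1},\ \underbrace{L_1, \ldots, L_l}_{\text{literals with atoms in } \mathcal{Q}_{i-1} \cup \ldots \cup \mathcal{Q}_0}.$$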


Example 7 Consider the set consisting of four clauses: $A \lor B$, $A \lor \neg B$, $\neg A \lor B$,
and $\neg A \lor \neg B$. This set is Horn with respect to $\{A\}$ and also with respect to
$\{B\}$, but not with respect to $\{A, B\}$. This set of clauses admits stratification
$\emptyset \succ \{A\} \succ \{B\}$, in which $A$ is considered as a relation defined in terms of $B$,
but also admits $\emptyset \succ \{B\} \succ \{A\}$, in which $B$ is considered as defined in terms of
$A$. This example shows that it is hard to expect to find the “greatest” (in any
sense) stratification.
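
The two notions used in this example, Horn with respect to a set of atoms and
admitting a stratification, can be tested directly on ground clauses. Below is a
hedged Python sketch: literals are assumed to be pairs `(positive, atom)`, the
strata are passed as two lists `p_strata = [P_1, ..., P_n]` and
`q_strata = [Q_0, ..., Q_n]`, and the function names are hypothetical.

```python
def is_horn_wrt(clauses, p_atoms):
    """Every clause has at most one positive literal whose atom is in p_atoms."""
    return all(
        sum(1 for (pos, atom) in c if pos and atom in p_atoms) <= 1
        for c in clauses
    )


def admits(clause, p_strata, q_strata):
    """Check that a clause definite w.r.t. P admits the given stratification.

    p_strata[k - 1] is P_k and q_strata[k] is Q_k (lowest strata first).
    """
    p_atoms = set().union(*p_strata)
    heads = [atom for (pos, atom) in clause if pos and atom in p_atoms]
    if len(heads) != 1:
        raise ValueError("clause is not definite w.r.t. P")
    # Index i of the stratum P_i containing the head atom.
    i = next(k for k, s in enumerate(p_strata, start=1) if heads[0] in s)
    # Allowed atoms: P_i, ..., P_1 and Q_{i-1}, ..., Q_0.
    allowed = set().union(*p_strata[:i], *q_strata[:i])
    return all(atom in allowed for (_, atom) in clause)


# The four clauses of Example 7.
EXAMPLE_7 = [
    [(True, "A"), (True, "B")],
    [(True, "A"), (False, "B")],
    [(False, "A"), (True, "B")],
    [(False, "A"), (False, "B")],
]
print(is_horn_wrt(EXAMPLE_7, {"A"}), is_horn_wrt(EXAMPLE_7, {"A", "B"}))
```

For the stratification in which $A$ is defined in terms of $B$, one would call
`admits(clause, [{"A"}], [{"B"}, set()])` on the two clauses containing $A$ positively.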

Let us fix a well-founded order $\succ$, a selection function $\sigma$, and a $\succ$-stratification
(2).

Definition 8 (Stratified Resolution) The inference system of stratified resolution,
denoted $\mathcal{SR}$, consists of the following inference rules.

1. Positive ordered factoring:

$$\frac{C \lor \underline{A} \lor A}{C \lor A},$$

where $A$ is a positive literal, and $C$ contains no positive $\mathcal{P}$-literals.

2. Binary ordered resolution with selection:

$$\frac{C \lor \underline{A} \qquad D \lor \underline{\neg A}}{C \lor D},$$

where $A \succ C$ and $C, D, A$ contain no positive $\mathcal{P}$-literals.

3. Definition unfolding:

$$\frac{C \lor P \qquad D \lor \underline{\neg P}}{C \lor D},$$

where $P \in \mathcal{P}$ and $D$ contains no positive $\mathcal{P}$-literals. Note that in this rule
we do not require that $P$ be selected in $C \lor P$.



Theorem 9 (Soundness and Completeness of Stratified Resolution)

Let $S$ be a set of clauses Horn w.r.t. $\mathcal{P}$. Let, in addition, every clause in $S$
definite w.r.t. $\mathcal{P}$ admit stratification (2). Then $S$ is unsatisfiable if and only if
$\square$ is derivable from $S$ in the system of stratified resolution.

Proof. Soundness is obvious. To prove completeness, we use a technique of
[3] (see also [2]) for proving completeness of resolution with free selection for
Horn sets. Let us explain the idea of this proof. The inference rules of stratified
resolution remind us of the rules of ordered resolution with negative selection,
but with a nonstandard selection function: in any clause $P \lor C$ definite w.r.t.
$\mathcal{P}$ the literal $P$ is selected. We could use Theorem 4 on completeness of ordered
resolution with negative selection if $P$ was maximal in $P \lor C$, since we could
then select $P$ in $P \lor C$ using a standard selection function. However, $P$ is not
necessarily maximal: there can be other $\mathcal{P}$-literals in $C$ greater than $P$. What
we do is to “rename” the clause $P \lor C$ so that $P$ becomes greater than $C$.

Formally, let $\mathcal{P}'$ be a new set of atoms of the form $P^n$, where $P \in \mathcal{P}$, and
$n$ is a natural number. Denote by $\mathcal{L}'$ the set of atoms $\mathcal{P}' \cup \mathcal{Q}$. We will refer to
the set $\mathcal{L}'$ as the new language as opposed to the original language $\mathcal{L}$. Define a
mapping $\rho : \mathcal{L}' \to \mathcal{L}$ by (i) $\rho(P^n) = P$ for any natural number $n$, (ii) $\rho(A) = A$
if $A \in \mathcal{Q}$. Extend the mapping to literals and clauses in the new language in a
natural way: $\rho(\neg A) = \neg\rho(A)$ and $\rho(L_1 \lor \ldots \lor L_n) = \rho(L_1) \lor \ldots \lor \rho(L_n)$.

For every clause $C$ Horn w.r.t. $\mathcal{P}$ of the original language we define a set of
clauses $C^\rho$ in the new language as follows.

1. If $C$ is definite w.r.t. $\mathcal{P}$, then $C$ has the form $P \lor \neg P_1 \lor \ldots \lor \neg P_k \lor D$, where
$k \geq 0$ and $D$ has no $\mathcal{P}$-literals. We define $C^\rho$ as the set of clauses

$$\{\, P^{1+n_1+\ldots+n_k} \lor \neg P_1^{n_1} \lor \ldots \lor \neg P_k^{n_k} \lor D \mid n_1, \ldots, n_k \text{ are natural numbers} \,\}.$$

2. Otherwise, $C$ has the form $\neg P_1 \lor \ldots \lor \neg P_k \lor D$, where $k \geq 0$ and $D$ has no
$\mathcal{P}$-literals. We define $C^\rho$ as the set of clauses

$$\{\, \neg P_1^{n_1} \lor \ldots \lor \neg P_k^{n_k} \lor D \mid n_1, \ldots, n_k \text{ are natural numbers} \,\}.$$

For any set of clauses $S$ Horn w.r.t. $\mathcal{P}$ we define $S^\rho$ as the set of clauses $\bigcup_{C \in S} C^\rho$.
We prove the following:

(3) $S$ is satisfiable if and only if so is $S^\rho$.

Suppose $S$ is satisfiable, then some valuation $\tau : \mathcal{L} \to \{true, false\}$ satisfies $S$.
Define a valuation $\tau' : \mathcal{L}' \to \{true, false\}$ such that $\tau'(A) = \tau(A)$ if $A \in \mathcal{Q}$,
and $\tau'(P^n) = \tau(P)$. It is not hard to argue that $\tau'$ satisfies $S^\rho$.

Now we suppose that $S$ is unsatisfiable and show that $S^\rho$ is unsatisfiable,
too. We apply induction on the number $k$ of $\mathcal{Q}$-literals occurring in $S$.

1. Case $k = 0$. Then $S$ is a set of Horn clauses. This case has been considered in
[3]. (The idea is that $P^n$ is interpreted as “$P$ has a derivation by SLD-resolution
in $n$ steps”.)

2. Case $k > 0$. For any set of clauses $T$ and literal $L$, denote by $T/L$ the set of
clauses obtained from $T$ by removing all clauses containing $L$ and removing
from the rest of the clauses all occurrences of the literal $\overline{L}$. Note that if $L$ is a
$\mathcal{Q}$-literal occurring in $T$, then $T/L$ contains fewer $\mathcal{Q}$-literals than $T$. It is not
hard to argue that $T$ is unsatisfiable if and only if so are both $T/L$ and $T/\overline{L}$.
Take any $\mathcal{Q}$-literal $L$ occurring in $S$. Since $S$ is unsatisfiable, so are both $S/L$
and $S/\overline{L}$. By the induction hypothesis, $(S/L)^\rho$ and $(S/\overline{L})^\rho$ are unsatisfiable,
too. It is easy to see that $(S/L)^\rho = S^\rho/L$ and $(S/\overline{L})^\rho = S^\rho/\overline{L}$, so both $S^\rho/L$
and $S^\rho/\overline{L}$ are unsatisfiable. But then $S^\rho$ is unsatisfiable too.
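
The operation $T/L$ and the equivalence it relies on can be tried out on small
ground clause sets. The sketch below is illustrative only: it assumes literals are
pairs `(positive, atom)` and clauses are tuples of literals, uses a brute-force
truth-table test for satisfiability, and the names `reduce_by` and `satisfiable`
are hypothetical.

```python
from itertools import product


def neg(lit):
    """Complement of a literal."""
    positive, atom = lit
    return (not positive, atom)


def reduce_by(clauses, lit):
    """T/L: drop clauses containing lit, delete the complement of lit elsewhere."""
    comp = neg(lit)
    return [tuple(l for l in c if l != comp) for c in clauses if lit not in c]


def satisfiable(clauses):
    """Brute-force check over all valuations of the atoms that occur."""
    atoms = sorted({atom for c in clauses for (_, atom) in c})
    for values in product([False, True], repeat=len(atoms)):
        val = dict(zip(atoms, values))
        if all(any(val[a] == pos for (pos, a) in c) for c in clauses):
            return True
    return False


# T is unsatisfiable exactly when both T/L and T/complement(L) are.
T = [((True, "A"), (True, "B")), ((True, "A"), (False, "B")),
     ((False, "A"), (True, "B")), ((False, "A"), (False, "B"))]
L = (True, "B")
print(satisfiable(T), satisfiable(reduce_by(T, L)), satisfiable(reduce_by(T, neg(L))))
```

Here `T` is the clause set of Example 7, chosen only because it is a small
unsatisfiable set already present in the text.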

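The proof of (3) is completed. Let us continue the proof of Theorem 9. Define
an order $\succ'$ on $\mathcal{L}'$ as follows:

1. $\succ'$ and $\succ$ coincide on $\mathcal{Q}$-literals.

2. If $L$ is a $\mathcal{Q}$-atom and $P$ is a $\mathcal{P}$-atom, then for every $n$ we let $L \succ' P^n$ if and
only if $L \succ P$.

3. $P_1^{n_1} \succ' P_2^{n_2}$ if $P_1 \in \mathcal{P}_i$, $P_2 \in \mathcal{P}_j$ and $i > j$.

4. $P_1^{n_1} \succ' P_2^{n_2}$ if $P_1$ and $P_2$ belong to the same set $\mathcal{P}_i$ and $n_1 > n_2$.

5. $P_1^{n} \succ' P_2^{n}$ if $P_1$ and $P_2$ belong to the same set $\mathcal{P}_i$ and $P_1 \succ P_2$.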

Using the selection function $\sigma$ on clauses of the original language we will now
define a selection function $\sigma'$ on clauses of the new language. We will only be
interested in the behavior of $\sigma'$ on clauses Horn w.r.t. $\mathcal{P}'$, so we do not define it
for other clauses.

1. If a clause $C$ is definite w.r.t. $\mathcal{P}'$, then $\sigma'$ selects in $C$ the maximal literal.

2. If $C$ contains no positive $\mathcal{P}'$-literals, then $\sigma'$ selects a literal $L$ in $C$ if and
only if $\sigma$ selects $\rho(L)$ in $\rho(C)$.

Note that our ordering $\succ'$ is defined in such a way that for any clause $(P^n \lor C) \in S^\rho$
definite w.r.t. $\mathcal{P}'$ we have $P^n \succ' C$, and hence $P^n$ is always selected in this
clause.
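
To see why the renamed head becomes maximal, items 3 to 5 of the definition of
$\succ'$ can be read as a lexicographic comparison of triples (stratum index,
superscript, position in the original order). The Python sketch below follows
that reading for renamed $\mathcal{P}$-atoms only; comparison with $\mathcal{Q}$-atoms (item 2) is left
out, and the names `key`, `stratum_of` and `rank` as well as the sample atoms are
assumptions made for illustration.

```python
def key(atom, stratum_of, rank):
    """Sort key for a renamed atom (name, n), i.e. name^n, among P'-atoms."""
    name, n = atom
    return (stratum_of[name], n, rank[name])


# Sample data: P and P1 share stratum 2, P2 lies in stratum 1, and in the
# original order P1 is greater than P (so P is not maximal before renaming).
stratum_of = {"P": 2, "P1": 2, "P2": 1}
rank = {"P": 0, "P1": 1, "P2": 2}

# Renaming of the clause P :- P1, P2 with superscripts n1 and n2.
for n1 in range(3):
    for n2 in range(3):
        head = ("P", 1 + n1 + n2)
        body = [("P1", n1), ("P2", n2)]
        greatest = max([head] + body, key=lambda a: key(a, stratum_of, rank))
        print(n1, n2, greatest == head)
```

Because the head carries the superscript $1 + n_1 + \ldots + n_k$, it exceeds every body
atom of the same stratum regardless of how the original order ranks them.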

It is not hard to argue that $\sigma'$ is a selection function. To complete the proof
we use Theorem 4 on completeness of the system $\mathcal{R}_{\sigma'}^{\succ'}$, applied to $S^\rho$.

Consider a derivation of a clause $D'$ from $S^\rho$ with respect to $\mathcal{R}_{\sigma'}^{\succ'}$. We show
simultaneously that:

(4) there exists a derivation of the clause $D = \rho(D')$ from $S$ in the system
$\mathcal{SR}$;

(5) if $D' \notin S^\rho$, then $D'$ contains no positive $\mathcal{P}'$-literal.

We apply induction on the length of the derivation of $D'$. If $D' \in S^\rho$, then
$\rho(D') \in S$, so the induction base is straightforward. Assume now that the derivation
of $D'$ is obtained by an inference from $D'_1, \ldots, D'_n$. We will show that $D$ can
be obtained by an inference in $\mathcal{SR}$ from $\rho(D'_1), \ldots, \rho(D'_n)$; this will imply (4),
since $\rho(D'_1), \ldots, \rho(D'_n)$ can be derived in $\mathcal{SR}$ from $S$ by the induction hypothesis.
Consider the following cases.

1. Case: $D' = C' \lor A$ is derived from $C' \lor A \lor A$ by positive ordered factoring.
Then $A \succeq' C'$. Note that $A$ cannot be a $\mathcal{P}'$-atom: $C' \lor P^n \lor P^n \notin S^\rho$
because all clauses in $S^\rho$ are Horn w.r.t. $\mathcal{P}'$, and $C' \lor P^n \lor P^n$ cannot be
derived by the induction hypothesis. Therefore, $A$ is a $\mathcal{Q}$-atom, and hence
$\rho(C' \lor A \lor A) = \rho(C') \lor A \lor A$.

Moreover, $C'$ contains no positive literal $P^n$ because in this case the clause
$(\rho(C') \lor A \lor A) \in S$ is definite w.r.t. $\mathcal{P}$, so it admits the stratification, so
$P \succ A$, which contradicts $A \succ' P^n$.

Therefore, $C' \lor A$ contains no positive $\mathcal{P}'$-literal. Then by our definition of
$\sigma'$ and $\succ'$, $A$ is maximal and selected in $\rho(C') \lor A \lor A$, so we can apply
positive ordered factoring and derive $\rho(C') \lor A$ in $\mathcal{SR}$. It is easy to see that
$\rho(C') \lor A = \rho(C' \lor A)$ and that $\rho(C' \lor A)$ contains no positive $\mathcal{P}$-literal.

2. Case: $D' = C'_1 \lor C'_2$ is obtained from $C'_1 \lor A$ and $C'_2 \lor \neg A$ by binary ordered
resolution with selection. Denote $C_1 = \rho(C'_1)$ and $C_2 = \rho(C'_2)$. Consider two
cases.

(a) Case: $A \in \mathcal{Q}$. Then $\rho(A) = A$. In this case we show that

$$\frac{C_1 \lor A \qquad C_2 \lor \neg A}{C_1 \lor C_2}$$

is an inference in $\mathcal{SR}$ by binary ordered resolution with selection.

Since $A \succ' C'_1$, then $A \succ C_1$. By our definition of $\sigma'$, $A$ ($\neg A$) is selected
by $\sigma$ in $C_1 \lor A$ (in $C_2 \lor \neg A$) because $A$ ($\neg A$) is selected by $\sigma'$ in $C'_1 \lor A$
($C'_2 \lor \neg A$). It remains to check that $C_1, C_2$ contain no positive $\mathcal{P}$-literals.
This is done exactly as in the case of positive ordered factoring.

(b) Case: $A = P^n$. We prove that

$$\frac{C_1 \lor P \qquad C_2 \lor \neg P}{C_1 \lor C_2}$$

is an inference in $\mathcal{SR}$ by definition unfolding. Indeed, since $\neg P^n$ is selected
by $\sigma'$ in $C'_2 \lor \neg P^n$, then $\neg P$ is selected by $\sigma$ in $C_2 \lor \neg P$.

As in the previous cases, we can prove that $C'_1, C'_2$ contain no positive
literal of the form $P^m$.

Now we can conclude the proof of Theorem 9. Suppose that $S$ is unsatisfiable.
By (3), $S^\rho$ is unsatisfiable. Then by Theorem 4 the empty clause $\square$ is derivable
from $S^\rho$ in $\mathcal{R}_{\sigma'}^{\succ'}$. Hence, by (4), $\rho(\square)$ is derivable from $S$ in $\mathcal{SR}$. But $\rho(\square) = \square$,
so $\square$ is derivable from $S$ in $\mathcal{SR}$.

We will now show that the condition on clauses to admit stratification is 
essential. 

Example 10 This example is taken from [7]. Consider the following set of 
clauses: 

V P ^P V Q V 

V ^P ^P V ^P 

^PVP PVQVP 

This clause is unsatisfiable and definite w.r.t. P. Consider the ordering P 
Q >- P. However, the empty clause cannot be derived from it, even if tautologies 
are allowed. Indeed, the conclusion of any inference by stratified resolution is 
subsumed by one of the clauses in this set. The problem is that the clause 
P V P V Q admits no )^-stratification. 

3 Redundancies in Stratified Resolution 

In this section we make stratified resolution into an inference system on sets of 
clauses and add two kinds of inferences that remove clauses from sets. Then we 
prove completeness of the resulting system using a new technique of traces. 

The derivable objects in the inference system of stratified resolution with
redundancies are sets of clauses. Each inference has exactly one premise $S$ and
conclusion $S'$; we will denote such an inference by $S \rhd S'$. We assume that an ordering
$\succ$, a stratification and a selection function $\sigma$ are defined as in the case of $\mathcal{SR}$.

In this section we always assume that $S_0$ is an initial set of clauses Horn
w.r.t. $\mathcal{P}$ and having the following property: for every $P \in \mathcal{P}$, $S_0$ contains only a
finite number of clauses in which $P$ is a positive literal.

Definition 11 (Stratified Resolution with Redundancies) The inference
system of stratified resolution with redundancies, denoted $\mathcal{SRR}$, consists of the
following inference rules.

1. Suppose that $(C \lor \underline{A} \lor A) \in S$ and $A$ is a positive literal. Then

$$S \rhd S \cup \{C \lor A\}$$

is an inference by positive ordered factoring.

2. Suppose that $\{C \lor \underline{A},\ D \lor \underline{\neg A}\} \subseteq S$, where $A \succ C$ and $C, D, A$ contain no
positive $\mathcal{P}$-literals. Then

$$S \rhd S \cup \{C \lor D\}$$

is an inference by binary ordered resolution.

3. Suppose $(C \lor \underline{\neg P}) \in S$, where $P \in \mathcal{P}$. Furthermore, suppose that
$P \lor D_1, \ldots, P \lor D_k$ are all clauses in $S$ containing $P$ positively. Then

$$S \rhd (S - \{C \lor \neg P\}) \cup \{C \lor D_1, \ldots, C \lor D_k\}$$

is an inference by definition rewriting. Note that this inference deletes the
clause $C \lor \neg P$. Moreover, if $S$ contains no clauses containing the positive
literal $P$, then $C \lor \neg P$ is deleted from the search space and not replaced by
any other clause.

4. Suppose $\{C, D\} \subseteq S$, $C \neq D$ and $C$ subsumes $D$. Then

$$S \rhd S - \{D\}$$

is an inference by subsumption.
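
The definition rewriting rule, which is the one that removes a clause from the
search space, can be sketched on ground clause sets as follows. This is an
illustrative Python sketch under stated assumptions: literals are pairs
`(positive, atom)`, a clause is a sorted tuple of literals (so that equal multisets
compare equal), the caller is trusted to pass a clause in which $\neg P$ is selected,
and `clause` and `definition_rewriting` are hypothetical names.

```python
def clause(*lits):
    """Canonical representation of a ground clause as a sorted tuple."""
    return tuple(sorted(lits))


def definition_rewriting(s, target, p_atom):
    """Replace target = C v ~P by all C v D_i, where P v D_i ranges over s."""
    neg_p, pos_p = (False, p_atom), (True, p_atom)
    if target not in s or neg_p not in target:
        raise ValueError("target must be a clause of s containing the negated atom")
    rest = list(target)
    rest.remove(neg_p)              # C: drop one occurrence of ~P
    new = set()
    for d in s:
        if pos_p in d:
            body = list(d)
            body.remove(pos_p)      # D_i: the definition body
            new.add(clause(*rest, *body))
    return (s - {target}) | new


s = {
    clause((True, "P"), (False, "Q")),   # P v ~Q
    clause((True, "P"), (True, "R")),    # P v R
    clause((False, "P"), (True, "G")),   # ~P v G, the clause to rewrite
}
print(sorted(definition_rewriting(s, clause((False, "P"), (True, "G")), "P")))
```

When no clause of the set contains $P$ positively, the loop adds nothing and the
rewritten clause simply disappears, which matches the remark in item 3.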

For the new calculus, we have to change the notion of derivation into a new
one. We call a derivation from a set of clauses $S_0$ any sequence of inferences

$$S_0 \rhd S_1 \rhd S_2 \rhd \ldots,$$

possibly infinite. The derivation is said to succeed if some $S_i$ contains $\square$, and fail
otherwise.

Consider a derivation $\mathbb{S} = S_0 \rhd S_1 \rhd S_2 \rhd \ldots$. The set $\bigcup_{i \geq 0} \bigcap_{j \geq i} S_j$ is called the
limit of this derivation and denoted by $\lim_{\mathbb{S}}$. The derivation is called fair if the
following three conditions are fulfilled:

1. If $(C \lor \underline{A} \lor A)$ belongs to the limit of $\mathbb{S}$ and $A \succ C$, then some inference in
$\mathbb{S}$ is the inference by positive ordered factoring

$$S_i \rhd S_i \cup \{C \lor A\}.$$

2. If $C \lor \underline{A}$, $D \lor \underline{\neg A}$ belong to the limit of $\mathbb{S}$, $A \succ C$ and $C, D, A$ contain
no positive $\mathcal{P}$-literals, then some inference in $\mathbb{S}$ is the inference by binary
ordered resolution

$$S_i \rhd S_i \cup \{C \lor D\}.$$

3. Let a clause $C \lor \underline{\neg P}$ belong to the limit of $\mathbb{S}$ and $P \in \mathcal{P}$. Then some inference
in $\mathbb{S}$ is the inference by definition rewriting

$$S_i \rhd (S_i - \{C \lor \neg P\}) \cup \{C \lor D_1, \ldots, C \lor D_n\}.$$

We call a selection function $\sigma$ subsumption-stable if it has the following property.
Suppose $C$ subsumes $D$, $L$ is a literal in $D$ selected by $\sigma$ and $L \in C$. Then
$L$ is also selected in $C$ by $\sigma$. It is evident that subsumption-stable selection
functions exist.

Theorem 12 (Soundness and Completeness of $\mathcal{SRR}$) Let $S_0$ be a set of clauses
Horn w.r.t. $\mathcal{P}$ such that every clause definite w.r.t. $\mathcal{P}$ in $S_0$ admits stratification (2).
Let $\sigma$ be a subsumption-stable selection function. Consider derivations in $\mathcal{SRR}$.
(i) If any derivation from $S_0$ succeeds, then $S_0$ is unsatisfiable. (ii) If $S_0$ is
unsatisfiable, then every fair derivation from $S_0$ succeeds.

Proof. Soundness is easy to prove. Let a derivation $S_0 \rhd \ldots \rhd S_n$ succeed. It is
easy to see that every clause in every $S_i$ is a logical consequence of $S_0$. Then $\square$
is a logical consequence of $S_0$, and hence $S_0$ is unsatisfiable.

Proving completeness is more difficult. We will introduce several notions,
prove some intermediate lemmas, and then come back to the proof of this theorem.

The main obstacle in the proof of completeness is that some clauses can be
deleted from the search space when we apply definition rewriting or subsumption.
If we only had subsumption, the proof could be done by standard methods
based on clause orderings. However, definition rewriting can rewrite clauses into
larger ones. For example, consider a set $S$ of clauses that contains two clauses
definite w.r.t. $\mathcal{P}$: $P_1 \mathrel{:-} P_2$ and $P_2 \mathrel{:-} P_1$. Then the following is a valid derivation:

$$S \cup \{\neg P_1\} \rhd S \cup \{\neg P_2\} \rhd S \cup \{\neg P_1\} \rhd \ldots,$$

so independently of the choice of the ordering $\succ$ one of these inferences replaces a
smaller clause by a bigger one.

To prove completeness we will use a novel technique of traces. This technique 
was originally introduced in [10], where it was used to prove completeness of a 
system of modal logic with subsumption. Intuitively, a trace is whatever remains 
of a clause when the clause is deleted. Unlike [10], in this paper a trace can be 
an infinite set of clauses. 

Suppose that $S_0$ is the set of initial clauses, Horn w.r.t. $\mathcal{P}$. Consider any
clause $D$ that contains no positive $\mathcal{P}$-literals. Consider the following one-player
game that builds a tree of clauses. Initially, the tree contains one root node $D$.
We call a node $C$ closed if either $C = \square$ or the literal selected in $C$ by $\sigma$ is a
$\mathcal{Q}$-literal. An open node is any leaf that is not closed. At every step of the game
the player has two choices of moves:

1. (Subsumption move). Select any leaf $C$ (either closed or open) and add to $C$
as the child node any clause $C'$ such that $C'$ subsumes $C$ and $C' \neq C$.

2. (Expansion move). Select an open leaf $C$. If the literal selected in $C$ by $\sigma$
is a negative $\mathcal{P}$-literal $\neg P$, then $C$ has the form $\neg P \lor C'$. (As we shall see
later, no clause in the tree can contain a positive $\mathcal{P}$-literal, so in this case a
negative $\mathcal{P}$-literal is always selected.) Let all clauses in $S_0$ that contain the
positive literal $P$ be $P \lor D_1, \ldots, P \lor D_n$. Then add to this node as children
all nodes $C' \lor D_1, \ldots, C' \lor D_n$.

Using the property that $S_0$ is Horn w.r.t. $\mathcal{P}$ it is not hard to argue that no clause
in the tree can contain a positive $\mathcal{P}$-literal, so an open leaf always contains a
selected negative $\mathcal{P}$-literal.

A game is fair if every non-closed node is selected by the player. Let us call
a tree for $D$ any tree obtained as a limit of a fair game. We call a cover for $D$
the set of all closed leaves in any tree $T$ for $D$. We will say that the cover is
obtained from the tree $T$ and denote the cover by $cl_T$.

Example 13 Suppose that the set of clauses $S_0$ is

$$\{P_i \lor \neg Q_i \mid i = 1, 2, \ldots\} \cup \{P_i \lor \neg P_{i+1} \mid i = 1, 2, \ldots\},$$

where all $P_i$'s are $\mathcal{P}$-literals and all $Q_i$'s are $\mathcal{Q}$-literals. Suppose that the selection
function $\sigma$ always selects a $\mathcal{P}$-literal in a clause that contains at least one
$\mathcal{P}$-literal. Then one possible tree for $\neg P_1 \lor Q_1$ is as follows:

                $\neg P_1 \lor Q_1$
               /                  \
    $\neg Q_1 \lor Q_1$        $\neg P_2 \lor Q_1$
                              /                  \
                   $\neg Q_2 \lor Q_1$        $\neg P_3 \lor Q_1$
                                             /                  \
                                           ...                  ...

In this tree no subsumption move was applied. The cover for $\neg P_1 \lor Q_1$ obtained
from this tree is the set of clauses

$$\{\neg Q_i \lor Q_1 \mid i = 1, 2, \ldots\}.$$
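
The cover in this example is infinite, but a finite prefix of it can be
enumerated by playing only expansion moves. The Python sketch below is an
illustration under stated assumptions: literals are triples
`(kind, index, positive)` with `kind` equal to `"P"` or `"Q"`, the selection function
picks the first $\mathcal{P}$-literal of a clause, and `definitions` and `closed_leaves`
are hypothetical names.

```python
from collections import deque


def definitions(i):
    """Bodies D of the clauses P_i v D in Example 13: ~Q_i and ~P_{i+1}."""
    return [[("Q", i, False)], [("P", i + 1, False)]]


def is_open(node):
    """A node is open if it still contains a (negative) P-literal."""
    return any(kind == "P" for (kind, _, _) in node)


def closed_leaves(root, max_expansions):
    """Closed leaves found after at most max_expansions expansion moves."""
    if not is_open(root):
        return [root]
    closed, open_leaves, expansions = [], deque([root]), 0
    while open_leaves and expansions < max_expansions:
        node = open_leaves.popleft()
        selected = next(l for l in node if l[0] == "P")
        rest = list(node)
        rest.remove(selected)                 # C' in the expansion move
        for body in definitions(selected[1]):
            child = rest + body               # C' v D_i
            (open_leaves if is_open(child) else closed).append(child)
        expansions += 1
    return closed


root = [("P", 1, False), ("Q", 1, True)]      # ~P_1 v Q_1
for leaf in closed_leaves(root, 3):
    print(leaf)
```

Processing open leaves in first-in, first-out order means no open leaf is
postponed forever, which is the fairness requirement on the game.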

Let us prove some useful properties of covers. 

Lemma 14 Every cover for a clause $C$ is also a cover for any clause $C \lor D$.

Proof. Let $S$ be a cover for $C$ and $T$ be the tree from which this cover is
obtained. Apply a subsumption move to $C \lor D$:

    $C \lor D$
        |
       $C$

and extend this tree by the tree $T$ below $C$. Evidently, we obtain a tree for $C \lor D$
and the cover obtained from this tree is also $S$.

The following straightforward lemma asserts that it is enough to consider
trees with no repeated subsumption moves.

Lemma 15 For every cover $S$ for a clause $C$ there exists a tree $T$ for $C$ such
that $S = cl_T$ and no subsumption move in $T$ was applied to a node obtained by
a subsumption move.

Proof. Let $T'$ be a tree for $C$ such that $S = cl_{T'}$. Suppose that $T'$ contains a
node obtained by a subsumption move so that a subsumption move is applied
to this node as well:

    $D_1 \lor D_2 \lor D_3$
            |
       $D_1 \lor D_2$
            |
          $D_1$

Evidently, these two subsumption moves can be replaced by one subsumption
move:

    $D_1 \lor D_2 \lor D_3$
            |
          $D_1$

In this way we can eliminate all sequences of subsumption moves and obtain a
required tree $T$.

Let $C$ be a clause and $\mathbb{S}$ be a derivation. We call a trace of $C$ in $\mathbb{S}$ any cover
$S$ for $C$ such that $S \subseteq \lim_{\mathbb{S}}$. When we speak about a trace of $C$ in $\mathbb{S}$ we do not
assume that $C$ occurs in $\mathbb{S}$.

Lemma 16 Let $\mathbb{S} = S_0 \rhd S_1 \rhd \ldots$ be a fair derivation. Then every clause $D$
occurring in any $S_i$ has a trace in $\mathbb{S}$.

Proof. We will play a game which builds a tree $T$ such that $cl_T$ is a trace
of $D$ in $\mathbb{S}$. We will define a sequence of trees $T_i, T_{i+1}, \ldots$ such that each $T_{j+1}$ is
obtained from $T_j$ by zero or more moves, and $T$ is the limit of this sequence.
The initial tree $T_i$ consists of the root node $D$. Each tree $T_{j+1}$ is obtained from
$T_j$ by inspecting the inference $S_j \rhd S_{j+1}$ as follows.

1. Suppose this inference is a subsumption inference

$$S_j \rhd S_j - \{C \lor D\},$$

where $C \in S_j$ and $C \lor D$ is a leaf node in $T_j$. Then make a subsumption move
for each such leaf, adding $C$ to it as the child.

2. Suppose this inference is a definition rewriting inference

$$S_j \rhd (S_j - \{C \lor \neg P\}) \cup \{C \lor D_1, \ldots, C \lor D_n\},$$

such that $C \lor \neg P$ is a leaf in $T_j$. Then make an expansion move for each
such leaf, adding to it the clauses $C \lor D_1, \ldots, C \lor D_n$ as children.

3. In all other cases make no move at all.

Let us make several observations about the tree T and the game. 

1. Every leaf in any tree $T_j$ is a member of $S_j$. This is proved by induction
on $j$. For $T_i$ this holds by the definition of $T_i$, since it contains one leaf
$D \in S_i$. By our construction, every leaf of any $T_{j+1}$ which is not a leaf of $T_j$
is a member of $S_{j+1}$.

It remains to prove that every node which is a leaf in both $T_j$ and $T_{j+1}$ belongs
to $S_{j+1}$. Take any such node $C$; by the induction hypothesis it belongs
to $S_j$. Suppose, by contradiction, that $C \notin S_{j+1}$; then $C$ has been removed by either
a subsumption or a definition rewriting inference. But by our construction,
in both cases $C$ will become a nonleaf in $T_{j+1}$, so we obtain a contradiction.

2. The game is fair. Suppose, by contradiction, that an open leaf has never
been selected during the game. Then this leaf contains a clause $C \lor \neg P$. Let
$T_j$ be any tree containing this open leaf. Then the leaf belongs to all trees
$T_k$ for $k \geq j$. By the previous item, $C \lor \neg P \in S_k$ for all $k \geq j$. Therefore,
the clause $C \lor \neg P$ belongs to the limit of $\mathbb{S}$, but definition rewriting has not
been applied to it. This contradicts the fairness of $\mathbb{S}$.

3. Every closed leaf $C$ in $T$ belongs to the limit of $\mathbb{S}$. Let this leaf first appear
in a tree $T_j$. Then it belongs to all trees $T_k$ for $k \geq j$. By the first item,
$C \in S_k$ for all $k \geq j$, but then $C \in \lim_{\mathbb{S}}$.

Now consider $cl_T$. Since $T$ was built using a fair game, $cl_T$ is a cover for $D$. But
we proved that every element of $cl_T$ belongs to the limit of $\mathbb{S}$, hence $cl_T$ is also
a trace of $D$ in $\mathbb{S}$.

Lemma 17 Let $\mathbb{S} = S_0 \rhd S_1 \rhd \ldots$ be a fair derivation. Further, let $E$ be a clause
derivable from $S_0$ in $\mathcal{SR}$. Then $E$ has a trace in $\mathbb{S}$.

Proof. The proof is by induction on the derivation of $E$ in $\mathcal{SR}$. When the
derivation consists of 0 inferences, $E \in S_0$. By Lemma 16 every clause of every
$S_i$ has a trace in $\mathbb{S}$, so $E$ has a trace.

When the derivation has at least one inference, we consider the last inference 
of the derivation. By the induction hypothesis, all premises of this inference 
have traces in S. We consider three cases corresponding to the inference rules of 
stratified resolution. In all cases, using Lemma 15 we can assume that no tree 
contains a subsumption move applied to a node which itself is obtained by a 
subsumption move. 







1. The last inference is by positive ordered factoring:

$$\frac{C \lor \underline{A} \lor A}{C \lor A}.$$

Consider a trace $S$ of $C \lor A \lor A$ in $\mathbb{S}$ and a tree $T$ from which this trace was
obtained. Since $A$ is selected by $\sigma$ in $C \lor A \lor A$, there are only two possibilities for
$T$:

(a) Case: $T$ consists of one node $C \lor A \lor A$. Then $\{C \lor A \lor A\}$ is a trace,
and hence $(C \lor A \lor A) \in \lim_{\mathbb{S}}$. Since $\mathbb{S}$ is fair, some $S_i$ contains the clause
$C \lor A$ obtained from $C \lor A \lor A$ by positive ordered factoring. Then by
Lemma 16, $C \lor A$ has a trace in $\mathbb{S}$.

(b) Case: the first move of $T$ is a subsumption move. This subsumption move
puts a clause $C'$ which subsumes $C \lor A \lor A$ as the child to $C \lor A \lor A$.
Consider two cases.

i. Case: $C'$ has the form $C'' \lor A \lor A$, where $C''$ subsumes $C$. Since the
selection function is subsumption-stable, $A$ is selected in $C'' \lor A \lor A$
by $\sigma$, so the tree $T$ contains no node below $C'' \lor A \lor A$. Then
$(C'' \lor A \lor A) \in \lim_{\mathbb{S}}$. Since $\mathbb{S}$ is fair, some $S_i$ contains the clause
$C'' \lor A$ obtained from $C'' \lor A \lor A$ by positive ordered factoring. By
Lemma 16, $C'' \lor A$ has a trace in $\mathbb{S}$. But $C'' \lor A$ subsumes $C \lor A$,
so by Lemma 14 this trace is also a trace of $C \lor A$.

ii. Case: $C'$ subsumes $C \lor A$. Note that $cl_T$ is also a trace of $C'$, so by
Lemma 14 $cl_T$ is also a trace of $C \lor A$.

2. The last inference is by binary ordered resolution with selection:

$$\frac{C \lor \underline{A} \qquad D \lor \underline{\neg A}}{C \lor D},$$

where $A \succ C$ and $C, D, A$ contain no positive $\mathcal{P}$-literals. Consider traces
$S_1, S_2$ of $C \lor A$ and $D \lor \neg A$, respectively, and trees $T_1, T_2$ from which these
traces were obtained. We consider two simple cases and then the remaining
case.

(a) If the first move of $T_1$ is a subsumption move that replaces $C \lor A$ by a
clause $C'$ that subsumes $C$, then $C'$ has a trace in $\mathbb{S}$, but $C'$ subsumes
$C \lor D$, so by Lemma 14 this trace is also a trace of $C \lor D$.

(b) Likewise, if the first move of $T_2$ is a subsumption move that replaces
$D \lor \neg A$ by a clause $D'$ that subsumes $D$, then $D'$ has a trace in $\mathbb{S}$, so
by Lemma 14 this trace is also a trace of $C \lor D$.

(c) If neither of the previous cases takes place, then either $T_1$ consists of one
node or the top move in $T_1$ is a subsumption move:

    $C \lor A$
        |
    $C' \lor A$

such that $C'$ subsumes $C$. Note that in the latter case, since the selection
function is subsumption-stable, $A$ is selected in $C' \lor A$. In both cases the
limit of $\mathbb{S}$ contains a clause $C' \lor A$ such that $C'$ subsumes $C$ (in the
former case we can take $C$ as $C'$).

Likewise, we can prove that the limit of $\mathbb{S}$ contains a clause $D' \lor \neg A$ such
that $D'$ subsumes $D$. Since $\mathbb{S}$ is fair, some $S_i$ contains the clause $C' \lor D'$
obtained from $C' \lor A$ and $D' \lor \neg A$ by binary ordered resolution with
selection. By Lemma 16, $C' \lor D'$ has a trace in $\mathbb{S}$. But $C' \lor D'$ subsumes
$C \lor D$, so by Lemma 14 $C \lor D$ has a trace in $\mathbb{S}$.

3. The last inference is by definition unfolding:

$$\frac{C \lor P \qquad D \lor \underline{\neg P}}{C \lor D},$$

where $P \in \mathcal{P}$. Consider a trace $S$ of $D \lor \neg P$ and a tree $T$ from which this trace
was obtained.

(a) If the first move of $T$ is a subsumption move that replaces $D \lor \neg P$ by a
clause $D'$ that subsumes $D$, then $D'$ has a trace in $\mathbb{S}$, so by Lemma 14
this trace is also a trace of $C \lor D$.

(b) If the previous case does not take place, there are two possibilities for
the top move of $T$: it is either an expansion move or a subsumption move
followed by an expansion move. We consider the former case; the latter
case is similar. The top of the tree $T$ has the form

            $D \lor \neg P$
          /       |       \
       ...    $C \lor D$    ...

Denote the subtree of $T$ rooted at $C \lor D$ by $T'$. Note that $cl_{T'}$ is a cover
of $C \lor D$. But since $T'$ is a subtree of $T$, we have $cl_{T'} \subseteq cl_T$. Since $cl_T$
is a trace, we also have $cl_T \subseteq \lim_{\mathbb{S}}$, then $cl_{T'} \subseteq \lim_{\mathbb{S}}$, and hence $cl_{T'}$ is
a trace. So $C \lor D$ has a trace in $\mathbb{S}$.



The proof is completed. 

We can now easily complete the proof of completeness of stratified resolution 
with redundancies. 

Proof (of Theorem 12, Continued). Suppose $S_0$ is unsatisfiable. Take any fair
derivation $\mathbb{S} = S_0 \rhd S_1 \rhd \ldots$. By Theorem 9 there exists a derivation of $\square$ from
clauses in $S_0$ by stratified resolution. By Lemma 17, $\square$ has a trace in $\mathbb{S}$, i.e. a
cover whose members belong to the limit of $\mathbb{S}$. By the definition of a cover, $\square$ has
only one cover $\{\square\}$. Then $\square$ belongs to the limit of $\mathbb{S}$, and hence $\mathbb{S}$ is successful.

The proof can be easily modified for a system with redundancies in which
clauses containing positive $\mathcal{P}$-literals can also be subsumed and in which
tautologies are deleted.

4 Nonground Case



In this section we introduce the general, nonground, version of stratified resolu- 
tion with redundancies. 

Most of the definitions given below are obtained by a simple modification of
the definition for the ground case. In this section we treat $\mathcal{L}$, $\mathcal{P}$, and $\mathcal{Q}$ as sets
of relations but not atoms. To define stratification, we use a precedence relation
$\succ_{\mathcal{L}}$, i.e. a total ordering on $\mathcal{L}$. We call a $\mathcal{P}$-literal any literal whose relation
belongs to $\mathcal{P}$, and similarly for other sets of relations instead of $\mathcal{P}$. The notions of
Horn set w.r.t. $\mathcal{P}$, definite clause w.r.t. $\mathcal{P}$, stratification, and clause admitting a
stratification are the same as in the ground case.

Now we have to define an ordering $\succ$ corresponding to our notion of stratification.
We require $\succ$ to be a well-founded ordering on atoms, stable under substitutions
(i.e. $A \succ B$ implies $A\theta \succ B\theta$) and total on ground atoms. In addition,
we require that $\succ$ is compatible with the precedence relation in the following
sense: if $A_1 \succ_{\mathcal{L}} A_2$, then $A_1(t_1, \ldots, t_n) \succ A_2(s_1, \ldots, s_m)$ for all relations $A_1, A_2$
and terms $t_1, \ldots, t_n$ and $s_1, \ldots, s_m$. It is obvious how to modify recursive path
orderings or Knuth-Bendix orderings so that they satisfy this requirement.

In the definition of selection function for the nonground case we revert in fact to
one of the earlier notions given in [4]. We define a (nonground) selection function to
be a function $\sigma$ on the set of clauses such that (i) $\sigma(C)$ is a set of literals in $C$,
(ii) $\sigma(C)$ is nonempty whenever $C$ is nonempty, and (iii) at least one negative
literal or otherwise all maximal literals must be selected. Similar to the ground
case we assume that selection functions are subsumption-stable in the following
sense: if a literal $L$ is selected in $C \lor L$ and $D$ is a submultiset of $C$, then $L$ is
selected in $D \lor L$.

Let us fix an order $\succ$, a selection function $\sigma$, and stratification (2).

Definition 18 (Stratified Resolution with Redundancies) The inference
system of stratified resolution with redundancies, denoted $\mathcal{SRR}$, consists of the
following inference rules.

1. Positive ordered factoring is the following inference rule:

$$S \rhd S \cup \{(C \lor A_1)\theta\}$$

such that (i) $S$ contains a clause $C \lor \underline{A_1} \lor A_2$, (ii) $A_1, A_2$ are positive literals,
and (iii) $\theta$ is a most general unifier of $A_1$ and $A_2$.

2. Binary ordered resolution with selection is the following inference rule:

$$S \rhd S \cup \{(C \lor D)\theta\}$$

such that (i) $S$ contains clauses $C \lor \underline{A_1}$ and $D \lor \underline{\neg A_2}$, (ii) $A_1\theta \succeq C\theta$, (iii)
$C, D, A_1$ contain no positive $\mathcal{P}$-literals, and (iv) $\theta$ is a most general unifier
of $A_1$ and $A_2$.

3. Definition rewriting. Suppose $(C \lor \underline{\neg P}) \in S$, where $P$ is a $\mathcal{P}$-atom. Furthermore,
suppose that $P_1 \lor D_1, \ldots, P_k \lor D_k$ are all clauses in $S$ such that $P_i$ is unifiable
with $P$. Then

$$S \rhd (S - \{C \lor \neg P\}) \cup \{(C \lor D_1)\theta_1, \ldots, (C \lor D_k)\theta_k\},$$

where each $\theta_i$ is a most general unifier of $P$ and $P_i$, is an inference by
definition rewriting.



Theorem 19 Let $S$ be a set of clauses Horn w.r.t. $\mathcal{P}$. Let, in addition, every
clause in $S$ definite w.r.t. $\mathcal{P}$ admit stratification (2). Then $S$ is unsatisfiable if
and only if every fair derivation from $S$ in the system $\mathcal{SRR}$ contains the empty
clause.

The proof can be obtained by standard lifting, with special care taken for the
selection function.



5 How to Select a Stratification 

Example 7 shows that a set of clauses may admit several different stratifications.
How can one choose a “good” stratification?

Suppose that a set of clauses $S$ is Horn w.r.t. $\mathcal{P}$. Then we can
always use the stratification

$\mathcal{P} \succ \mathcal{Q}$,

in which any $\mathcal{P}$-atom is greater than any $\mathcal{Q}$-atom.
Unfortunately, this stratification may not be good enough, since it gives us
too little choice for selecting positive $\mathcal{Q}$-literals. Let us
illustrate this for the clauses of Example 1. Assume that $\mathcal{P}$ is
$\{\mathit{split}\}$. We obtain the stratification

$\{\mathit{split}\} \succ \{\mathit{deduction}, \mathit{conference\_list}\}$.

This stratification does not allow us to select the literal
$\mathit{conference\_list}(x)$ in

$\neg \mathit{split}(x, y, z) \vee \mathit{conference\_list}(x)$,

while intuitively it should be the right selection.

This observation shows that it can be better to split the set of atoms into as
many strata as possible. Then we will have more options for selecting positive
$\mathcal{Q}$-atoms in clauses. In Example 1, a more flexible stratification
would be

$\{\mathit{conference\_list}\} \succ \{\mathit{split}\} \succ \{\mathit{deduction}\}$.

We are planning experiments with the choice of stratification using our
theorem prover Vampire [9].

6 Conclusion

We will mention several interesting open problems associated with stratified 
resolution. 

In our proofs, we could not get rid of the annoying condition that selection
functions be subsumption-stable.

Problem 1 Is stratified resolution with redundancies complete with selection 
functions that are not subsumption-stable? 

It is quite possible that this can be shown, but it would require a different
proof technique.

Problem 2 Find new techniques for proving completeness of stratified
resolution with redundancies.



Problem 3 Find a powerful generalization of definite resolution for logic with 
equality. 

There is quite a simple generalization based on the following transformation
of clauses definite w.r.t. $\mathcal{P}$: we can replace any clause
$P(t_1, \ldots, t_n) \mathrel{:-} C$ by
$P(x_1, \ldots, x_n) \mathrel{:-} x_1 = t_1, \ldots, x_n = t_n, C$. Then we can
define stratified resolution and prove its completeness in essentially the same
way as before. However, in this case any clause containing
$\neg P(s_1, \ldots, s_n)$ will be unifiable with the head of any clause
defining $P$, so the gain from using definition rewriting is no longer obvious.
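
The transformation can be made concrete with a small Python sketch. It treats
terms and atoms as plain strings, and it assumes that the fresh variables
`x1, ..., xn` do not already occur in the clause; the function name
`flatten_head` and the string encoding are illustrative choices only.

```python
def flatten_head(head_pred, head_args, body):
    """Rewrite P(t1, ..., tn) :- C into
    P(x1, ..., xn) :- x1 = t1, ..., xn = tn, C."""
    fresh = [f"x{i}" for i in range(1, len(head_args) + 1)]
    equations = [f"{x} = {t}" for x, t in zip(fresh, head_args)]
    new_head = f"{head_pred}({', '.join(fresh)})"
    return new_head, equations + list(body)


# P(f(y), a) :- Q(y)   becomes   P(x1, x2) :- x1 = f(y), x2 = a, Q(y)
head, body = flatten_head("P", ["f(y)", "a"], ["Q(y)"])
```

After the transformation every head of a clause defining P has only variables
as arguments, which is why any negative P-literal unifies with it.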

The standard semantics of stratified logic programs are based on nonmonotonic
reasoning. Stratified resolution makes one think of a logic that combines
nonmonotonic reasoning with the monotonic resolution-based reasoning. Such a
logic, its semantics and ways of automated reasoning in it, should be
investigated. So we have

Problem 4 Find a combination of stratified resolution with nonmonotonic
logics.



Stratified resolution is different from ordered resolution with negative
selection in that it allows one to select heads of clauses, even when they are
not maximal in their clauses. It is interesting to see if this can give us new
decision procedures for decidable fragments of predicate calculus. So, we have

Problem 5 Find new decision procedures based on stratified resolution. 

7 Recent Developments 

Recently, Robert Nieuwenhuis proposed a new interesting method for proving
completeness of stratified resolution. His method gives a positive answer to
Problem 1: the restriction on selection functions to be subsumption-stable is
not necessary. In a way, it also answers Problem 2, since his technique is
rather different from ours. He also proposed a more concise definition of
stratification
and clauses admitting stratification in terms of well quasi orderings.

Abstract. In a binary tree representation of a binary resolution proof,
rotating some tree edge reorders two adjacent resolution steps. When
rotation is not permitted to disturb factoring, and thus does not change
the size of the tree, it is invertible and defines an equivalence relation on
proof trees. When one resolution step is performed later than another
after every sequence of such rotations, we say that the later resolution
supports the other.

For a given ordering on atoms, or on atom occurrences, a support ordered
proof orders its resolution steps so that the atoms are resolved
consistently with the given order without violating the support relation
between nodes. Any proof, including the smallest proof tree, can be
converted to a support ordered proof by rotations. For a total order, the
support ordered proof is unique. The support ordered proof is also a
rank/activity proof where atom occurrences are ranked in the given order.

Procedures intermediate between literal ordered resolution and support
ordered resolution are considered. One of these, 1-weak support ordered
resolution, allows resolution on a non-maximal literal only if it is
immediately followed by both a factoring and a resolution on some greater
literal. In a constrained experiment where literal ordered resolution solves
only six of 408 TPTP problems with difficulty between 0.11 and 0.56, 1-weak
support ordered resolution solves 75.



1 Introduction 

Automated theorem provers, in their search for a proof, must balance the
deductive power of a calculus, telling what can be derived from a given point in
the search, with restriction strategies, telling which deductions are to be avoided.
Clearly the restriction strategy must not remove all of the choices that eventually
lead to a proof, at least not without the user's being aware of its incompleteness.
But even so, the restriction strategy may remove all shortest proofs, leading
to another undesirable effect: the theorem prover takes longer to find a longer
proof. An ideal restriction strategy would reduce the space to one richly
populated with only short proofs, be simple to implement and quick to check. This
is an unrealistic ideal. In this paper we give a reduction strategy that is quick
to check, simple to implement, admits smallest proof trees, and is almost as
restrictive as literal ordered resolution, a widely used and successful restriction.
Literal ordered resolution was used in the CADE-16 System Competition [9] by
at least Bliksem, Gandalf, HOTTER, SPASS and Vampire.

In our setting we represent proofs as binary trees, labeled by clauses according 
to Robinson's resolution method [6]. A node is labeled both by a clause, referring
to the conclusion drawn at this point by the resolution, and if it is not a leaf, 
by the atom that was resolved upon to give this conclusion. Our measure of 
size is the number of nodes in this tree. Often theorem provers build sequences 
of formulae where each deductively follows from previous ones. This sequence 
represents a traversal of a directed acyclic graph (dag) that underlies the tree. 
The size of an underlying dag is a more natural measure of proof size than 
the size of the tree. But often if one dag is smaller than another then the tree 
expanded from the first dag is smaller than the tree expanded from the other. 

Support ordered resolution depends on the notion of support between two 
resolution steps, first defined in [4] and defined for proof trees in [7]. When 
compared with the literal ordered restriction, used by many theorem provers and 
explored in [3], support ordered resolution is less restrictive, in that it admits a 
very specific additional resolution step. On the other hand the support ordered 
restriction does not increase the size of the smallest tree, unlike literal ordered 
resolution which may restrict all smallest trees, and in some cases admit only 
exponentially larger trees and dags, as shown in Example 5 below. 

Rotating some tree edge reorders two adjacent resolution steps. When
rotation is not permitted to disturb factoring, and thus does not change the size
of the tree, it is invertible and defines an equivalence relation on proof trees.
For a given total order on atoms, there is a single support ordered tree in each
rotation equivalence class. Since the equivalence classes typically contain an
exponential number of trees, support ordered resolution substantially reduces the
search space.

It is interesting to find a restriction of a resolution calculus that admits 
a smallest deduction (tree) while substantially reducing the search space, as 
support ordered resolution does. It is also interesting in that it brings together 
two apparently different restrictions, the rank/activity restriction [5] and literal 
ordered resolution. A given ordering on atoms can be used to set the ranks of 
literals in each clause, and then the rank/activity proof is the support ordered 
proof. 

Viewing support ordered resolution as a generalization of other restrictions,
we can identify a number of its special cases. These suggest themselves as
candidates for experiments. One set of these experiments has been done for the
special case called 1-weak support ordered resolution, or 1-wso. This proof format
depends on a restriction that can be quickly checked on partially closed binary
resolution trees. Recall that the literal ordered restriction allows a resolution only
on a maximal atom in each clause; 1-wso allows maximal resolutions and also
allows a resolution on some non-maximal atom but then requires an immediate
merge on a greater atom from different parents followed by a resolution step on
that merged atom. Our experiments indicate that 1-wso provides a
deductive/reductive tradeoff that is worth taking.

In the following sections we provide necessary background including the
recent notion of support between nodes in a binary resolution tree [7]. This is
followed by the introduction of support ordered resolution, the restriction of
resolution closely related to literal ordered resolution but weakened in those
situations where it conflicts with the support of one node for another. We also
describe support ordered resolution as a generalization of both rank/activity
and literal ordered resolution. This suggests a space of possible strategies. We
describe one proof procedure in this space, 1-weak support ordered resolution,
which is a slight addition to a typical literal ordered resolution theorem prover.
We also give the results of our experiments.

2 Background 

A binary tree is a set of nodes and edges, where each edge joins a parent node to 
a child node, and where each node has one child or has zero and is then called 
the root, and each node has two parents or has zero and is then called a leaf. 
The descendant (ancestor) relation is the reflexive, transitive closure of child 
(parent) . 



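[Figure 1. Left tree T: nodes A: $c \vee \alpha$ and B: $\neg c \vee e \vee \beta$
resolve on c at C: $e \vee \alpha \vee \beta$; C and D: $\neg e \vee \delta$
resolve on e at E: $\alpha \vee \beta \vee \delta$. Right tree T': B and D
resolve on e at E: $\neg c \vee \beta \vee \delta$; A and E resolve on c at
C: $\alpha \vee \beta \vee \delta$.]

Fig. 1. A binary tree rotation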



Given the binary tree fragment T on the left of Figure 1, a rotation is the
reassignment of edges so that the tree T' on the right of Figure 1 is produced.
The parent C of E becomes the child of E and the parent B of C becomes the
parent of E. In other words, the edges (B, C) and (C, E) are replaced by (B, E)
and (E, C). If E has a child F in T, then C takes that child in T', or equivalently
the edge (E, F) is replaced by (C, F).
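
The edge reassignment can be sketched as follows. The tree is stored as a
dictionary mapping each non-leaf node to the list of its two parents (in this
paper parents lie towards the leaves and the child towards the root). The
representation, the function name `rotate` and the extra node names `F` and `G`
in the example are assumptions made for this illustration; the sketch moves
edges only and does not recompute clause labels.

```python
def rotate(parents, c, e, b):
    """Rotate the edge (c, e): c's parent b becomes a parent of e,
    e becomes a parent of c, and c inherits e's child (if any)."""
    tree = {node: list(ps) for node, ps in parents.items()}
    assert c in tree[e] and b in tree[c]
    # The child of e, if it exists, now has c as its parent instead of e.
    for ps in tree.values():
        if e in ps:
            ps[ps.index(e)] = c
    tree[e][tree[e].index(c)] = b  # edge (b, e) replaces edge (c, e)
    tree[c][tree[c].index(b)] = e  # edge (e, c) replaces edge (b, c)
    return tree


# Left tree of Figure 1, extended by a hypothetical child F of E.
before = {"C": ["A", "B"], "E": ["C", "D"], "F": ["E", "G"]}
after = rotate(before, "C", "E", "B")
```

The input dictionary is copied first, so the original tree is left unchanged
and the two trees can be compared.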

We use standard definitions [2] for atom, literal, substitution, unifier and
most general unifier. A clause is a multiset of literals. The clause $C$ subsumes
the clause $D$ if there exists a substitution $\theta$ such that
$C\theta \subseteq D$ (as sets, not as multisets). A variable renaming
substitution is one in which every replacement of a variable maps to another
variable, and no two variables map to the same variable. Two clauses $C$ and $D$
are equal up to variable renaming if there exists a variable renaming
substitution $\theta$ such that $C\theta = D$ (as multisets). Two clauses
are standardized apart if no variable occurs in both. Given two parent clauses
$C_1 \vee a_1 \vee \ldots \vee a_m$ and $C_2 \vee \neg b_1 \vee \ldots \vee \neg b_n$
which are standardized apart (a variable renaming substitution may be required),
a resolvent is the clause $(C_1 \vee C_2)\theta$ where $\theta$ is a most general
unifier of $\{a_1, \ldots, a_m, b_1, \ldots, b_n\}$. The atom resolved upon is
$a_1\theta$, and the set of resolved literals is
$\{a_1, \ldots, a_m, \neg b_1, \ldots, \neg b_n\}$.

Definition 1. A binary resolution tree, or brt, on a set $S$ of input clauses is a
binary tree where each node $N$ in the tree is labeled by a clause label, denoted
$cl(N)$. The clause label of a leaf node is an instance of a clause in $S$, and the
clause label of a non-leaf is the resolvent of the clause labels of its parents. A
non-leaf node is also labeled by an atom label, $al(N)$, equal to the atom resolved
upon. The clause label of the root is called the result of the tree, $result(T)$. A
binary resolution tree is closed if its result is the empty clause, $\Box$.

Our resolution is based on Robinson’s original resolution, which we use to 
define resolution mapping and history path. The resolution mapping tells what 
happens to each literal in a given resolution step, and the history path tells what 
happens to it from the leaf where it is introduced to the node where it is resolved 
away. 

The resolution mapping $\rho$ at an internal node in a brt maps each resolved
literal, $a_1, \ldots, a_m, \neg b_1, \ldots, \neg b_n$, to the atom resolved
upon and maps each unresolved member $c$ of $C_1$ or $C_2$ to the occurrence of
$c\theta$ in the resolvent.

Let the nodes $(N_0, \ldots, N_n)$ occur in a binary resolution tree $T$ such
that $N_0$ is a leaf whose clause label contains a literal $a$, and for each
$i = 1, \ldots, n$, $N_{i-1}$ is a parent of $N_i$. Let $\rho_i$ be the
resolution mapping from the parents of $N_i$ to $N_i$. Also let
$\rho_i \ldots \rho_2 \rho_1 a$ occur in $cl(N_i)$, so that $a$ is not resolved
away at any $N_i$. Suppose $N_n$ either is the root of $T$, or has a child $N$
such that $\rho_n \ldots \rho_1 a$ is resolved upon. Then
$P = (N_0, \ldots, N_n)$ is a history path for $a$. The history path
is said to close at $N$ if $N$ exists.

Let T be a binary resolution tree as in Figure 1 with an edge (C, E) between
internal nodes such that E has a parent C and C has two parents A and B.
Further, assume that no history path through A closes at E. Then the result of
a rotation on this edge is the binary resolution tree T' defined by resolving cl(B)
and cl(D) on al(E) giving cl(E) in T' and then resolving cl(E) with cl(A) on
al(C) giving cl(C) in T'. Any history path closed at C in T is closed at C in T';
similarly any history path closed at E in T is closed at E in T'. Also, the child
of E in T, if it exists, is the child of C in T'.

Two trees $T_1$ and $T_2$ are rotation equivalent if $T_1$ is the result of a
rotation of an edge in $T_2$, or if $T_1$ and $T_2$ are both rotation equivalent
to another tree.

The condition on a rotation that no history path through A closes at E
is important, because otherwise the rotation would disturb factoring and thus
more resolutions might be needed. Consider the left tree in Figure 2. There are
history paths for e through both A and B that close at E. The tree after the
edge rotation, shown on the right side of Figure 2, has an unresolved occurrence
of e in the result.



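[Figure 2. Left tree: nodes A: $c \vee e \vee \alpha$ and B: $\neg c \vee e \vee \beta$
resolve on c at C: $e \vee e \vee \alpha \vee \beta$; C and D: $\neg e \vee \delta$
resolve on e at E: $\alpha \vee \beta \vee \delta$. Right tree: B and D resolve
on e at E: $\neg c \vee \beta \vee \delta$; A and E resolve on c at
C: $e \vee \alpha \vee \beta \vee \delta$.]

Fig. 2. Not a brt rotation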
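
The difference between Figures 1 and 2 can be replayed with ground clauses.
The sketch below is an illustration only: clauses are multisets
(`collections.Counter`) of string literals, a negative literal carries a
leading `~`, and the clause remainders are encoded as the single placeholder
literals `alpha`, `beta` and `delta`. A resolution step removes every
occurrence of the resolved literal from each parent, which models the
factoring built into the resolvent.

```python
from collections import Counter


def resolve(pos_parent, neg_parent, atom):
    """Ground resolution on multisets: every occurrence of `atom` in the first
    parent and of its negation in the second parent is resolved away."""
    neg = "~" + atom
    assert pos_parent[atom] > 0 and neg_parent[neg] > 0
    left = Counter({l: k for l, k in pos_parent.items() if l != atom})
    right = Counter({l: k for l, k in neg_parent.items() if l != neg})
    return left + right


B = Counter(["~c", "e", "beta"])
D = Counter(["~e", "delta"])

# Figure 1: no history path through A closes at E, so the rotation is safe.
A1 = Counter(["c", "alpha"])
before1 = resolve(resolve(A1, B, "c"), D, "e")  # C first, then E
after1 = resolve(A1, resolve(B, D, "e"), "c")   # E first, then C

# Figure 2: A also contains e, and both copies of e close at E.
A2 = Counter(["c", "e", "alpha"])
before2 = resolve(resolve(A2, B, "c"), D, "e")
after2 = resolve(A2, resolve(B, D, "e"), "c")
```

In the first case both orders yield the same result; in the second case the
reordered tree still contains one occurrence of `e`, so a further resolution
would be needed.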



The calculus of binary resolution trees consists of the following: 

1 A node labeled by (an instance of) an input clause is a brt.

2 If $T$ is a brt and $\theta$ is a variable substitution then $T\theta$ is a brt
formed by replacing each label $l$ in $T$ by $l\theta$.

3 Suppose $T_1$ and $T_2$ are brts and no variable appears in both, $R_1$ and
$R_2$ are the clause labels of the roots of $T_1$ and $T_2$ respectively, and
$R$ is the clause formed by resolving $R_1$ and $R_2$ on atom $A$ with
substitution $\theta$. Then $T$ is a brt formed by creating a new node $N$ with
atom label $A$, clause label $R$ and the roots of $T_1\theta$ and $T_2\theta$
are $N$'s parents.

4 If $T$ is a brt and (C, E) is an edge then $T'$ formed by rotating the edge
(C, E) is a brt (shown in Figure 1).

Because not all rotations are allowed, sometimes a node N in a brt T remains
below another node M, under all sequences of rotations. When this occurs, we
say that N supports M. Support is a transitive relation, and support cannot be
circular. Although it is not exploited in this paper, support can be determined 
from the history paths of T [7]. 

3 Support Ordered Resolution 

To simplify the discussion we assume that the literal ordering is independent of 
sign, and thus is an atom ordering. The same results can be developed for literal 
ordering. Because we use atom ordering, the internal nodes of a brt are ordered 
in atom ordering on the atom labels of the nodes, i.e. the atom resolved upon 
at that node. 

Definition 2. Given an ordering < of atoms and a binary resolution tree T, we
say that a node N is support ordered if no descendant of N has higher order
than N unless it supports N. T is support ordered if all its nodes are.

In effect, support ordering is the lexical composition of the support relation
and atom ordering.
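
Definition 2 translates into a direct check once the support relation is
known. The following sketch assumes that the tree is given by child pointers
between internal nodes, that the atom order is given as a numeric rank per
node (larger meaning higher), and that the support relation is supplied as a
transitively closed set of pairs `(n, m)` meaning "n supports m". All names are
hypothetical.

```python
def descendants(child, node):
    """Proper descendants of `node`, following child pointers towards the root."""
    found = []
    while node in child:
        node = child[node]
        found.append(node)
    return found


def is_support_ordered(child, order, supports):
    """A descendant may be ordered higher than N only if it supports N."""
    return all(
        order[d] <= order[n] or (d, n) in supports
        for n in order
        for d in descendants(child, n) if d in order
    )


# A branch N1 -> N2 -> N3 (each node is the parent of the next).
child = {"N1": "N2", "N2": "N3"}
order = {"N1": 3, "N2": 1, "N3": 2}

unsupported = is_support_ordered(child, order, supports=set())
supported = is_support_ordered(child, order, supports={("N3", "N2")})
```

Here N3 lies below N2 but is ordered higher, so the branch is support ordered
only when N3 supports N2.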

Theorem 1. For a given partial ordering < on atoms and a given brt T, there 
is a support ordered proof tree T* that is rotation equivalent to T. If < is total, 
T* is unique. 

Proof. We proceed by induction on the size of T. If T has one or three nodes, T
is trivially support ordered. Suppose T has k nodes. Consider N, a node ordered
highest in < among those not supporting any other node in T. Rotate edges above
N so that both parents of N are leaves. These rotations are possible because N
supports no nodes. Let $C_N$ be the clause label of N in the resulting tree
$T_0$. From this tree, remove the parents $L_1$ and $L_2$ of N, so that N is a
leaf, and call the resulting smaller tree $T_1$. By induction there exists
$T_1^*$, a support ordered binary resolution tree of k − 2 nodes that is
rotation equivalent to $T_1$. N is a leaf of $T_1^*$. Construct T* by replacing
the parents $L_1$ and $L_2$ of N in $T_1^*$ so that the resolution done at
N is the same as in $T_0$. Because T and T* are rotation equivalent, the support
relations in T* and T are the same. All nodes in T* that are also in $T_1^*$ are
support ordered, with the possible exception of N. But since any descendants of
N that are ordered higher than N are supports of N, N is support ordered.

To argue uniqueness, consider for each leaf L the highest ordered descendant
$D_L$ that supports no other descendants of L. Such a node must exist because
circular support relations are not possible. Since < is total, $D_L$ is unique for
L. But $D_L$ must be the child of L in T*; otherwise T* is not support ordered.
Thus each leaf of T* has a unique child. For any node N in T*, other than the
root, this argument can be combined with an induction on the height of the tree
above N to show that N has a unique child. Thus T* is unique. □

To be useful, a restriction must be applicable to a partial proof. Also the 
check should require only a simple computation, preferably one with low
complexity (constant or linear time) and require information that is local to the
proof step. Therefore we define weak support ordered resolution, which can be 
checked quickly, locally, and on a partial proof. Unfortunately the check cannot 
be made with just the partial proof we have so far - more steps must be done. 
We will revisit this problem. 

Definition 3. Given an ordering < and a brt T, an edge between a parent $N_1$
and its child $N_2$ in T is weak support ordered if $N_1$ is ordered higher than
$N_2$ or if the edge is not rotatable. T is weak support ordered if every edge of
T is weak support ordered.
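
The local nature of this check shows in a sketch: each edge is examined on its
own. The set of rotatable edges is taken as given, and, as an assumption of
this illustration, only edges between internal nodes are listed and the atom
order is a numeric rank per node (larger meaning higher). The function name is
hypothetical.

```python
def is_weak_support_ordered(edges, order, rotatable):
    """Every (parent, child) edge has the parent ordered higher than the
    child, or else the edge is not rotatable."""
    return all(order[p] > order[c] or (p, c) not in rotatable
               for p, c in edges)


edges = [("N1", "N2"), ("N2", "N3")]
order = {"N1": 3, "N2": 1, "N3": 2}

# The out-of-order edge (N2, N3) is harmless only if it cannot be rotated.
fixed_edge = is_weak_support_ordered(edges, order, rotatable={("N1", "N2")})
loose_edge = is_weak_support_ordered(edges, order, rotatable={("N2", "N3")})
```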



Support ordered resolution is strictly more restrictive than weak support
ordered resolution, in that it admits strictly fewer proofs. Consider the
following brt: the nodes on a branch are $(N_d, N_a, N_c, N_b)$ where each is a
parent of the next. Let a be the atom resolved at $N_a$, and similarly for b, c,
d, and let the atom ordering dictate that a should be resolved before b, etc.
Suppose $N_a$ supports $N_d$ and $N_b$ supports $N_c$, so a must be resolved
after d and b after c. Thus according to the support relation, we have only two
choices to resolve first, d or c, and according to the atom order, we must
resolve c first. The resulting support ordered tree is illustrated on the left
of Figure 3. All edges in this tree are also weak support ordered; the only
rotatable edge is $(N_a, N_c)$ but it conforms to the atom ordering without
being rotated. But weak support order applies the atom ordering only locally. A
rotation equivalent tree, shown on the right of Figure 3, has the branch
$(N_c, N_b, N_d, N_a)$, which is also weak support ordered, since the only
rotatable edge is $(N_b, N_d)$ and it also conforms. Thus wso is a weaker
restriction than support order.



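[Figure 3. One tree resolves $a \vee b \vee c \vee d$ with $b \vee \neg c$ at
$N_c$ (c: $a \vee b \vee b \vee d$), then with $\neg b$ at $N_b$ (b: $a \vee d$),
then with $a \vee \neg d$ at $N_d$ (d: $a \vee a$). The other tree resolves
$a \vee b \vee c \vee d$ with $a \vee \neg d$ at $N_d$
(d: $a \vee a \vee b \vee c$), followed by resolutions with $\neg a$ and $\neg b$.]

Fig. 3. A support ordered tree and a weak support ordered tree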
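
For a single branch, the choice made in the proof of Theorem 1 (take the
highest ordered node that supports no other remaining node, and resolve it
first) can be sketched as a greedy loop. The sketch assumes a total order given
as numeric ranks (larger meaning "resolve earlier"), an acyclic support
relation given as pairs `(n, m)` for "n supports m", and a tree that is a single
branch; a general brt would need the rotations of the proof rather than a flat
sequence.

```python
def support_ordered_sequence(order, supports):
    """Return the nodes in the order in which they are resolved."""
    remaining = set(order)
    sequence = []
    while remaining:
        # Nodes that support no other remaining node may be resolved next.
        free = [n for n in remaining
                if not any((n, m) in supports for m in remaining)]
        nxt = max(free, key=lambda n: order[n])
        sequence.append(nxt)
        remaining.remove(nxt)
    return sequence


# Four steps on atoms a, b, c, d: a should be resolved before b, and so on,
# but Na supports Nd and Nb supports Nc.
order = {"Na": 4, "Nb": 3, "Nc": 2, "Nd": 1}
supports = {("Na", "Nd"), ("Nb", "Nc")}
sequence = support_ordered_sequence(order, supports)
```

Only the steps on c and d are initially free, and the atom order picks c; once
c is resolved, b becomes free and precedes d.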



4 Relation to the Rank/Activity Restrictions

The rank/activity restriction [5], or r/a, can be stated in terms of history paths 
and where they close. 

Given a rank function r that orders atoms in each clause, a brt T is defined
to be r-compliant if for each leaf L of T the literals of L are resolved away
either in r-order, or in the opposite order only if there is another history path
that closes with the higher ordered literal's history path, but does not intersect
the lower ordered one. That is, for each pair of literals $l_1, l_2$ in L that
close at descendants $d_1$ and $d_2$ respectively, if $r(l_2) < r(l_1)$ then
either (maximal case) $d_2$ is a descendant of $d_1$ or (non-maximal case) $d_1$
is a descendant of $d_2$ and some history path that does not intersect the path
for $l_2$ also closes at $d_1$. This is illustrated in Figure 4. The ovals
represent nodes in the brts and the lines with ground symbols represent history
paths and where they close. The tree on the left illustrates a maximal
resolution, where $l_2 < l_1$ so that $l_1$ is maximal and should be resolved
first. The tree on the right is a non-maximal resolution, where $l_1$ is
resolved later, only after it shares nodes with another history path for $l_1$.
Note that the shared node may be $d_2$, and that $d_1$ may be the child of
$d_2$, as it will be in the case of 1-wso resolution, below.



Alternately, one can state the r/a restriction in terms of an activity level
associated with each literal. Initially all literals are active, i.e. are available for
resolution. When a literal of a given rank is resolved, all literals of higher rank
are turned off, and are turned back on only if merged at this or some later
resolution step. In the definition above, if the lower ranked literal is resolved
first, the higher ranked literal is turned off, and the other history path closing
where the higher ranked literal closes is the one that re-activates it.

Note that in [5] it is the lower ranked literal that gets turned off. This
arbitrary decision was reversed for this paper to be more consistent with the usual
description of literal ordered resolution.
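
The activity bookkeeping for the literals that one parent contributes to a
resolvent can be sketched as below. The dictionaries, the function name
`after_resolution` and the `merged` argument are assumptions of this
illustration; literals of rank equal to the resolved one are left untouched,
since the description only mentions literals of higher rank.

```python
def after_resolution(ranks, active, resolved, merged=()):
    """Activity flags of the remaining literals after resolving `resolved`.

    ranks:  literal -> rank within its clause
    active: literal -> current activity flag
    merged: literals that take part in a merge at this step
    """
    new_active = {}
    for literal, rank in ranks.items():
        if literal == resolved:
            continue
        # Literals ranked higher than the resolved one are turned off ...
        still_on = active[literal] and rank <= ranks[resolved]
        # ... unless a merge turns them back on.
        new_active[literal] = still_on or literal in merged
    return new_active


ranks = {"p": 1, "q": 2, "r": 3}
active = {"p": True, "q": True, "r": True}
plain = after_resolution(ranks, active, "q")
with_merge = after_resolution(ranks, active, "q", merged=("r",))
```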

Thus the literal ordered resolution restriction is a special case of the r/a 
restriction, in which the ranks of literals in a clause are specified according to an 
ordering on the atoms, and non-maximal resolutions are not used. Since there
is no reactivation of literals, there is no chance that a resolution step on a
non-maximal literal will end in a refutation proof - it can never be resolved away.
Thus a procedure to compute literal ordered resolution proofs needs to consider 
only maximal literals and has much less choice at each step; this leads to a much 
reduced search space, and accounts for the speed and success of such procedures. 
R/A procedures have considerable fan out since non-maximal literals are often 
active as well. 

Yet in terms of proof size, the reduced choice of literal ordered resolution can
lead to the elimination of all proofs rotation equivalent to the smallest proof
tree. In Example 5 the smallest proof, found when reverse alphabetical ordering
of the atoms is used, has 15 resolutions, but the proof has 32,767 resolutions
when the order is alphabetical. This example, for n = 4, generalizes to $2^n$
clauses with n literals each, which can take from $2^n - 1$ to
$2^{2^n - 1} - 1$ resolutions depending on the ordering.

Fig. 4. A maximally and a non-maximally ordered resolution

$\{a, b1, c11, d111\}$
$\{a, b1, c11, \neg d111\}$
$\{a, b1, \neg c11, d112\}$
$\{a, b1, \neg c11, \neg d112\}$
$\{a, \neg b1, c12, d121\}$
$\{a, \neg b1, c12, \neg d121\}$
$\{a, \neg b1, \neg c12, d122\}$
$\{a, \neg b1, \neg c12, \neg d122\}$
$\{\neg a, b2, c21, d211\}$
$\{\neg a, b2, c21, \neg d211\}$
$\{\neg a, b2, \neg c21, d212\}$
$\{\neg a, b2, \neg c21, \neg d212\}$
$\{\neg a, \neg b2, c22, d221\}$
$\{\neg a, \neg b2, c22, \neg d221\}$
$\{\neg a, \neg b2, \neg c22, d222\}$
$\{\neg a, \neg b2, \neg c22, \neg d222\}$

Fig. 5. 32767 alphabetical a-ordered resolutions are required
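
The small end of this range can be checked mechanically. The sketch below
rebuilds the clause set of Figure 5 for arbitrary n and refutes it by always
resolving on the alphabetically last atoms first, counting resolution steps.
The encoding (clauses as frozensets of `(atom, sign)` pairs, so that identical
literals merge automatically) and the function names are assumptions of this
illustration. The large end of the range is only evaluated from the formula,
not derived by search.

```python
from itertools import product

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def build_clauses(n):
    """The 2**n clauses with n literals each, in the order of Figure 5."""
    clauses = []
    for bits in product((0, 1), repeat=n):
        clause = set()
        for k in range(n):
            # The atom name records the branch taken so far: a, b1, c12, ...
            atom = LETTERS[k] + "".join(str(b + 1) for b in bits[:k])
            clause.add((atom, bits[k] == 0))
        clauses.append(frozenset(clause))
    return clauses


def refute_last_atoms_first(n):
    """Resolve neighbouring clauses on the deepest atoms first."""
    level, steps = build_clauses(n), 0
    for k in reversed(range(n)):
        resolvents = []
        for i in range(0, len(level), 2):
            pos, neg = level[i], level[i + 1]
            atom = next(a for a, sign in pos if sign and a[0] == LETTERS[k])
            assert (atom, False) in neg
            resolvents.append((pos - {(atom, True)}) | (neg - {(atom, False)}))
            steps += 1
        level = resolvents
    return level[0], steps


result, steps = refute_last_atoms_first(4)
upper_bound = 2 ** (2 ** 4 - 1) - 1
```

For n = 4 the refutation uses 8 + 4 + 2 + 1 = 15 resolutions, matching the
reverse alphabetical count quoted in the text, while the formula for the other
extreme evaluates to 32767.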

In terms of deletion strategies, both rank/activity and literal ordered
resolution retain completeness when used with tautology deletion. In fact, rank/activity
can also be used with the regular and the surgery-minimal restrictions [7] which 
are strictly more restrictive. Literal ordered resolution with the surgery-minimal 
restriction is not complete. 

Subsumption deletion works well with literal ordered resolution, but only a 
weakened form works with rank/activity - if a clause with some inactive literals 
is used as the subsuming clause, it must have all of its literals reactivated. This 
can be seen just by observing that otherwise the subsuming clause would not 
be able to draw a non-strictly stronger conclusion in cases where the subsumed 
clause could. 

With respect to the ordering on literals, the r/a restriction does not require
that the ordering (or rank) of literals in a clause is consistent with an overall
literal ordering. In fact the ordering in r/a is applied on the literal occurrences,
and identical literals occurring in different clauses need not be ranked the same.
Thus the ordering can be total without being liftable. An ordering is said to be
liftable if $a < b$ iff $a\theta < b\theta$ for all substitutions $\theta$. In
literal ordered resolution, the orderings must be liftable to maintain
completeness. Since liftable orderings are often not total (but see [3]), in
these cases the restriction cannot choose a unique maximal literal, leading to
fan out in the search space.

Lock resolution [1], incidentally, is closely related. In lock resolution, as in
rank/activity, the ranks are assigned to literal occurrences, chosen in any
order, not according to an overall literal ordering. Like literal ordered resolution,
lock resolution uses only the maximal case of the resolutions in Figure 4.
Unfortunately lock resolution does not retain completeness with either tautology
deletion or subsumption.




Fig. 6. A space of restrictions. The horizontal axis ranges from allowing only maximal
resolutions to allowing non-maximal supported resolutions. The vertical axis indicates
whether the ordering is defined on literal occurrences or on the literal set



From this discussion we imagine a space of strategies, shown in Figure 6,
where the systems on the top all depend on orderings that must be liftable,
and thus in general are not total, and those on the bottom depend on arbitrary
ranks within a clause, which can be total. Those on the left depend only on the
maximal case of the resolutions in Figure 4, and those on the right, support
ordered and rank/activity, use both cases. The system labelled 1-wso is one
system in this space, described in the next section. With 1-wso, a very
restrictive form of the non-maximal case is permitted: a non-maximal atom may be
resolved on, but then a merge and a resolution on a greater atom must follow
immediately.

5 Building 1-Weak Support Ordered Proofs 

One should not construct support ordered proofs directly from the definition, 
since the support part of the restriction cannot be determined until the tree is 
completed. The rank/activity calculus with the ranks set according to the literal 
or atom ordering should be used instead. It then computes support ordered 
proofs. Even so, the number of allowable deductions at each stage may be too 
high, especially in the early stages when all literals are active. 

On the other hand the advantages of r/a may be useful: preservation of the 
smallest proof and compatibility with the surgery-minimal restriction. 

The 1-weak support ordered resolution keeps some of the advantages of both.
It uses the maximal case and a restricted form of the non-maximal case, from
Figure 4. Because the maximal case is sufficient to ensure completeness of the
procedure, we are free to further restrict the non-maximal case. In the 1-wso
non-maximal case, a non-maximal atom may be resolved but only if some greater
atoms occur in each parent and those greater atoms can be merged or factored.
Then the factoring is performed, even if a substitution is required. Moreover, the
resulting clause is forced to resolve in a conventional way on this factored literal,
whether or not it is maximal in the clause and whether or not a non-maximal
resolution would otherwise be permitted.
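
Read for ground clauses, the two cases can be told apart with a purely local
test on the two parents. The following sketch is one possible reading under
assumptions that go beyond the text: clauses are sets of string literals with
`~` marking negation, the atom order is a numeric rank, merging is modelled as
the same literal occurring in both parents, and when several greater literals
could be merged all of them are returned as candidates for the forced literal.
The function name is hypothetical.

```python
def classify_1wso(c1, c2, atom, rank):
    """Classify a ground resolution of c1 (containing atom) with c2
    (containing ~atom) as maximal, non-maximal, or rejected."""
    assert atom in c1 and "~" + atom in c2

    def atom_of(literal):
        return literal.lstrip("~")

    def is_maximal_in(clause):
        return all(rank[atom_of(l)] <= rank[atom] for l in clause)

    if is_maximal_in(c1) and is_maximal_in(c2):
        return "maximal", set()
    # Greater literals occurring in both parents merge in the resolvent;
    # one of them becomes the forced literal for the next step.
    mergeable = {l for l in c1 & c2 if rank[atom_of(l)] > rank[atom]}
    if mergeable:
        return "non-maximal", mergeable
    return "rejected", set()


rank = {"a": 3, "b": 2, "c": 1}
case_max = classify_1wso({"c"}, {"~c"}, "c", rank)
case_merge = classify_1wso({"c", "b"}, {"~c", "b"}, "c", rank)
case_reject = classify_1wso({"c", "b"}, {"~c", "a"}, "c", rank)
```

In the second call the resolvent would be the single literal `b`, merged from
both parents, and the next step is forced to resolve on it.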

We defined wso resolution to provide a slightly modified support condition
that is easier to compute. Although weak support ordered resolution looked
promising, we noted that the restriction could not be applied to a partial proof
without knowing more information. Specifically, when a resolution on a non-maximal
literal is made, it is not immediately knowable that this new node will eventually
be supported by the resolution on a greater literal that has been deferred. This
guarantee is now made by 1-wso by forcing such a supporting resolution to be
the next step. One can see that this new node supports the non-maximal one
because the rotation between these two nodes would disturb the merge, as in
Figure 2. We refer the reader to [7] for the proof that no other sequence
of rotations could somehow invert these nodes.

Because the resolution producing the merge or factor is one node away from
the resolution on the merged literal, we call the resulting system 1-weak
support ordered resolution.

There is a surprising distinction between wso and 1-wso. While they are
closely related, neither is a restriction of the other. In Figure 7, the tree on the
left is wso but not 1-wso. It is not 1-wso because the non-maximal resolution 
on b does not end up merging a greater literal. It is wso because of the two 
possible rotations, the upper one would unorder the nodes and the lower one 
would disturb a merge on a. The tree on the right of Figure 7 is 1-wso but not 
wso. It is 1-wso because after the non-maximal resolution on c, the resolution 
on the greater, merged atom b follows immediately. But it is not wso, because 
the lower resolutions should be inverted, according to the atom ordering. In 
Figure 7 wso and support order correspond, so the tree on the left is support 
ordered while the tree on the right is not. 




Fig. 7. A wso and a 1-wso tree

One of the design decisions of 1-wso was to prevent a great increase in the
number of choices beyond literal ordered resolution. The rather complex
condition for a non-maximal resolution, and the identification of the forced
literal for the next resolution, help to prevent such an increase. So 1-wso
resembles ordered resolution in that only one literal in each clause is
available for maximal resolution. Contrast this with rank/activity, which
exactly computes support ordered resolution but has all literals available for
resolution initially, so it must rely on other heuristics to limit the search.

Rank/activity guarantees that some shortest proof tree is in its search space. 
1-wso does not, but it does guarantee that some tree in its search space is non- 
strictly smaller than the smallest one in the search space for support ordered 
resolution. For the problem in Figure 5, a 1-wso tree of size 1511 nodes is found 
by the theorem prover described below. While this is much greater than the 
minimal 15, it is better than 32767. There may be a smaller 1-wso tree for this 
example. 

The 1-wso system inherits full subsumption from literal ordered resolution 
between clauses without forced literals. We may include in this any clause with 
a forced literal that happens to be maximal, which is the most common case. If 
a clause has a forced literal, and its other literals are subsumed by some clause 
C with no forced literals, then the subsumption should also be allowed; any 
clause generated when the forced literal is resolved upon will also be subsumed 
by C. A clause with a forced literal can also be subsumed by a clause with a 
forced literal if the forced literals are not different, i.e. if they correspond in 
the subsumption’s subset test. The subsumption is not allowed if a clause with a 
forced non-maximal literal tries to subsume a clause with a different literal either 
forced or maximal, because the candidate subsuming clause is not adequate to 
be used in place of the candidate subsumed clause. 

6 Experiments 

The experiments were conducted on a 400MHz Pentium II with 64 Mb RAM 
running RedHat Linux 6.0 with a theorem prover called pBliksem written by 
the first author in about 5000 lines of Prolog. The experiments depended on 
SWI-Prolog 3.2.8. pBliksem borrows heavily from the design of Bliksem, but 
as it is written in Prolog it has no claims to Bliksem’s speed. Nevertheless it 
is possible to overhaul the underlying data structures in a matter of minutes 
or hours, and to experiment with unconventional resolution steps. Written in 
Prolog it has a degree of dependability and clarity. It relies mainly on forward 
calls, instead of backtracking, so it makes heavy demands on Prolog’s garbage 
collection and tail recursive optimization. Overall it can manage about 5000 
inferences in ten minutes, but seldom manages significantly more inferences, 
because of memory usage. The inference rate varies from 150 inferences per 
second initially, to under ten per second after ten minutes. While this is clearly 
not a competitive system, there is nothing to suggest that the results we have 
obtained with this experimental system would not also be obtained by a better 
theorem prover. In particular the inferences it chooses are almost identical to 
those chosen by Bliksem. Occasionally the choices to be made are not determined 
by the settings of the flags and parameters, and in these tied cases the systems 
may choose differently. 

The experiments were performed on about 80% of the problems in TPTP-v2.3.0 [8] 
with difficulties in the range strictly above zero and strictly below 0.6. 
We selected 408 of the 528 such problems. The prover was configured to use Bliksem’s 
liftable literal order, in which lexicographical ordering is used, but literals that 
are lexically identical up to a variable are not comparable. The clause ordering 
depends first on the complexity of the clause, which is the sum of the number of 
function symbols, predicate symbols and variables. If the complexity of clauses is 
identical then the sizes of the underlying brts are compared. Two configurations 
of the theorem prover were tried: literal ordered resolution and 1-wso resolution. 
Ten minutes of computing time was given to each problem. 

Literal ordered resolution solved only six problems, whereas 1-wso solved 
75, including the six. 1-wso used about 10% more time than literal ordered 
resolution to solve those six. The numbers of non-maximal resolutions used are 
given in Table 1. This table shows that non-maximal resolutions were used in 
most proofs, except the six solved by the literal ordered resolution system. 
Most did not require many non-maximal resolutions but a few did: BOO035-1 
required 20 and SYN074-1 required 19. 



Number of non-maximal       Number of    Percentage 
resolutions in the tree     solved 
0                            6            8% 
1                           32           43% 
2                           14           19% 
3                           11           15% 
>3                          12           16% 

Table 1. Counting non-maximal resolutions used by 1-wso 



The 1-wso program did best on the problems with difficulty between 0.2 and 
0.3 and worst on the problems between 0.3 and 0.4, as shown by Table 2. Both 
above and below these ranges almost 20% of the problems were solved, 
indicating that the correlation between the difficulty measure and the success of 
the algorithm is rather small. Perhaps this is because the difficulty of a theorem 
is a very hard property to measure. 

The times taken to solve the problems, reported in Table 3, were scattered 
somewhat logarithmically with more toward the upper end of the time allocated. 
Each factor of two allowed about 10 more problems to be solved. 

Difficulty d      Number of Problems    Solved      % 
0.1 < d < 0.2             66              12      18% 
0.2 < d < 0.3             84              32      38% 
0.3 < d < 0.4            170              15       9% 
0.4 < d < 0.5             48               8      17% 
0.5 < d < 0.6             40               8      20% 
Total                    408 

Table 2. Difficulty of problems and number solved by 1-wso 

Time range (seconds)    Number of problems 
0 < t < 1                       13 
1 < t < 2                        4 
2 < t < 4                        2 
4 < t < 8                        1 
8 < t < 16                      11 
16 < t < 32                      2 
32 < t < 64                      6 
64 < t < 128                    13 
128 < t < 256                    9 
256 < t < 512                   12 
512 < t < 600                    2 

Table 3. Times taken to solve problems 

Table 4 shows the number of inferences generated by the prover, which would 
not be affected by the inefficiency of our implementation. The effect of allowing 
more inferences is similar to allowing more time, but there is a slightly larger 
shift to the upper end of the range. Each factor of two allowed about 12 more 
problems to be solved. 



Inferences Generated g    Number of problems 
0 < g < 10                        1 
10 < g < 20                       0 
20 < g < 40                       2 
40 < g < 80                       9 
80 < g < 160                      7 
160 < g < 320                     6 
320 < g < 640                    11 
640 < g < 1280                   11 
1280 < g < 2560                  16 
2560 < g < 5120                  11 
5120 < g < 5357                   1 

Table 4. Number of inferences required to solve problems 

Table 5 shows the number of problems of each category solved. The 1-wso 
procedure seems to be much better at some categories than others. Note that 
there is no special treatment of the equality literal in our implementation. 



Category    Number of problems    Solved    Percent 
BOO                  5               1        20% 
CAT                 11               0         0% 
COL                 15               2        13% 
FLD                 26               0         0% 
GEO                 57              19        33% 
GRP                 20               7        35% 
HEN                  5               5       100% 
LCL                 33              23        70% 
LDA                  6               1        17% 
NUM                  3               1        33% 
PLA                 15               0         0% 
PUZ                  1               0         0% 
RNG                  6               0         0% 
ROB                  1               0         0% 
SET                113              11        10% 
SYN                 91               5         5% 
Total              408              75        18% 

Table 5. Problems solved in each category 



7 Conclusions 

A good restriction strategy in an automated theorem prover depends on balancing 
the economy of choice with the economy of cuts. Literal ordered resolution 
strongly limits choices but cuts away many proofs, sometimes leaving only very 
big proofs. Support ordered resolution, or rank/activity, allows more choices, 
cuts away all but one of a set of rotation equivalent proofs, and leaves a 
smallest proof tree. 1-weak support ordered resolution is a step away from literal 
ordered resolution toward support ordered resolution, allowing very specific 
extra choices. It does not eliminate all the redundancy of rotation equivalence, 
nor does it preserve the smallest proof, but it is guaranteed to find non-strictly 
smaller proofs than literal ordered resolution, occasionally finding much smaller 
ones. Experiments indicate that the balance here is in favor of increasing choice 
beyond literal ordered resolution to increase the number of possible proofs. This 
leaves open the question whether and how to move toward support ordered res- 
olution, allowing modestly more choices while leaving more and smaller proofs 
in the search space. 

Abstract. IVY is a verified theorem prover for first-order logic with 
equality. It is coded in ACL2, and it makes calls to the theorem prover 
Otter to search for proofs and to the program MACE to search for 
countermodels. Verifications of Otter and MACE are not practical because 
they are coded in C. Instead, Otter and MACE give detailed proofs and 
models that are checked by verified ACL2 programs. In addition, the 
initial conversion to clause form is done by verified ACL2 code. The 
verification is done with respect to finite interpretations. 



1 Introduction 

Our theorem provers Otter [6,7,10] and EQP [4,8] and our model searcher 
MACE [3,5] are being used for practical work in several areas. Therefore, we 
wish to have very high confidence that the proofs and models they produce are 
correct. However, these are high-performance programs, coded in C, with many 
tricks, hacks, and optimizations, so formal verification of the programs is not 
practical. 

Instead, our approach is to have the C programs give their results explicitly 
as detailed proof objects or models, and to have separate checker programs check 
the results. The checker algorithms are relatively simple and straightforward, so 
it is practical to apply program verification techniques to them. In particular, we 
use the ACL2 program verification system to prove that if the checker program 
accepts a proof, then the proof is correct. 

Otter can convert first-order formulas into clauses (by normal form translation 
and Skolemization), but it is not able to include these preprocessing steps 
as part of the proof objects. Therefore, we have recoded the clause form translator 
in ACL2 and proved its soundness directly. The result is a hybrid system, 
named IVY, that (1) is driven by ACL2 code, (2) calls ACL2 functions for the 
preprocessing, (3) calls an external program to search for a proof or a model, 
and (4) calls ACL2 checker functions to check the results. The top-level soundness 
theorems have the form: If IVY claims a proof then the input formula is a 
theorem. A weakness of the verification method is that the soundness proofs are 
with respect to finite interpretations. In Section 6 we discuss an approach for all 
interpretations. 

* This work was supported by the Mathematical, Information, and Computational 
Sciences Division subprogram of the Office of Advanced Scientific Computing Research, 
U.S. Department of Energy, under Contract W-31-109-Eng-38. 

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 401-405, 2000. 
Springer-Verlag Berlin Heidelberg 2000 

William McCune and Olga Shumsky 

ACL2 (A Computational Logic for Applicative Common Lisp) [2,1] is the 
successor to the Boyer-Moore theorem prover. ACL2 is a specification/programming 
language, based on Common Lisp, together with an environment for proving 
theorems about the programs. Its strength is automated support for proving 
inductive theorems about recursively defined programs. 

2 Specification of the Logic 

We use ACL2 to define a first-order logic, and this becomes the specification 
for our verification. The definitions of well-formed term and well-formed formula 
are straightforward. We next define the semantics of our logic by defining 
interpretation of a first-order language. This part is nonstandard, because we restrict 
ourselves to finite interpretations; see Section 6. Finally, we define evaluation 
of a formula in an interpretation. The evaluation function is a pair of mutually 
recursive functions, in which one recurses through the structure of formulas, and 
the other (called for quantified formulas) recurses through the elements of the 
domain of the interpretation. In particular, the function (FEVAL F I) evaluates 
formula F in interpretation I. 

3 The Proof Procedure 

The proof search procedure is standard for first-order resolution/paramodulation 
theorem provers. Starting with the negation of a conjecture, we (1) convert 
to negation-normal form, (2) rename bound variables, (3) Skolemize, (4) move 
universal quantifiers to the top, (5) convert to conjunctive normal form, (6) 
search for a refutation (or a model), and (7) check the refutation (or model). 

Steps 1, 2, 4, and 5 produce an equivalent formula, and Skolemization pro- 
duces an equiconsistent formula. 

In IVY, the preprocessing steps (1-5) are coded in ACL2, the search step (6) 
is accomplished by calling Otter or MACE, and the checker step (7) is coded in 
ACL2. 
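
As an illustration of preprocessing step (1), the following Python sketch 
converts a formula to negation-normal form. It is not the ACL2 function used in 
IVY; the tuple encoding of formulas and the name nnf are assumptions made for 
this example. 

```python
# Hypothetical encoding: ("not", f), ("and", f, g), ("or", f, g),
# ("imp", f, g), ("all", var, f), ("exists", var, f); anything else is an atom.

def nnf(f):
    """Push negations inward until they apply only to atoms."""
    op = f[0]
    if op == "imp":
        return ("or", nnf(("not", f[1])), nnf(f[2]))
    if op in ("and", "or"):
        return (op, nnf(f[1]), nnf(f[2]))
    if op in ("all", "exists"):
        return (op, f[1], nnf(f[2]))
    if op == "not":
        g = f[1]
        gop = g[0]
        if gop == "not":                       # double negation
            return nnf(g[1])
        if gop == "and":                       # De Morgan
            return ("or", nnf(("not", g[1])), nnf(("not", g[2])))
        if gop == "or":                        # De Morgan
            return ("and", nnf(("not", g[1])), nnf(("not", g[2])))
        if gop == "imp":                       # not (a -> b) == a and not b
            return ("and", nnf(g[1]), nnf(("not", g[2])))
        if gop == "all":                       # quantifier duality
            return ("exists", g[1], nnf(("not", g[2])))
        if gop == "exists":
            return ("all", g[1], nnf(("not", g[2])))
        return f                               # negated atom
    return f                                   # atom

# not (all X (p(X) -> q(X)))
negated = ("not", ("all", "X", ("imp", ("p", "X"), ("q", "X"))))
```

For the example, `nnf(negated)` yields 
`("exists", "X", ("and", ("p", "X"), ("not", ("q", "X"))))`. Each rewrite is a 
standard equivalence, which is why this step can be stated as preserving the 
value of the formula in every interpretation. 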

4 Soundness Theorems 

The function to convert formulas to negation-normal form is (NNF F), and the 
soundness theorem states that NNF produces an equivalent formula: 

(EQUAL (FEVAL (NNF F) I) 
       (FEVAL F I)). 

The soundness theorems for steps 2, 4, and 5 of the proof procedure are similar. 
The soundness theorem for Skolemization is more complicated, because we have 
to extend the interpretation with the new Skolem symbols: 

(EQUAL (FEVAL (SKOLEMIZE F) (SKOLEMIZE-EXTEND F I)) 
       (FEVAL F I)). 

Steps 6 and 7 of the proof procedure are combined in an ACL2 function 
(REFUTE-N-CHECK F) which calls Otter (see Sec. 5) and the checker function. If 
Otter finds a refutation, and if the checker accepts the refutation, 
REFUTE-N-CHECK returns FALSE (the contradictory formula of our logic); otherwise 
REFUTE-N-CHECK returns the input formula F. Hence, it always produces 
an equivalent formula, and the soundness theorem is 

(EQUAL (FEVAL (REFUTE-N-CHECK F) I) 
       (FEVAL F I)). 

All of the preprocessing functions, REFUTE-N-CHECK, and a few other functions 
are composed into a top-level function (PROVED F), which takes the positive 
form of a conjecture, checks that it is well-formed and closed, negates it, and 
applies the proof procedure. The top-level soundness theorem is 

(IMPLIES (PROVED F) 
         (AND (WFF F) 
              (NOT (FREE-VARS F)) 
              (FEVAL F I))). 

In other words, if IVY claims a proof of a conjecture F, then F is a closed 
well-formed formula that is true in all (finite) interpretations. Of course, to accept 
this theorem, a user must accept our ACL2 definition of first-order logic and the 
soundness of the ACL2 system. But the point is that the user doesn’t have to 
trust Otter, which does the hard part of the work. 

The other side of the problem, searching for countermodels, is easier because 
checking a claimed model produced by the C program MACE is done by simply 
evaluating the negation of the conjecture in the claimed model. The top-level 
function (COUNTERMODEL F) is analogous to (PROVED F): it checks that the 
conjecture F is closed and well formed, negates it, preprocesses it, calls MACE to 
search for a finite model, and checks that the negation of F is true in any model 
found by MACE. The soundness theorem for (COUNTERMODEL F) is nearly trivial, 
because the evaluation property we need to prove is checked by COUNTERMODEL: 

(IMPLIES (COUNTERMODEL F) 
         (AND (WFF F) 
              (NOT (FREE-VARS F)) 
              (NOT (FEVAL F (COUNTERMODEL F))))). 

In other words, if IVY claims a countermodel to a conjecture F, then F is a closed 
well-formed formula that is false in some interpretation. 

5 Interface to the C Code 

The function REFUTE-N-CHECK takes the universal closure of a conjunction of 
clauses and returns an equivalent formula. First it transforms the input for- 
mula into an initial proof object. Next it calls the function EXTERNAL-PROVER 
which augments the initial proof object with additional steps that represent some 
derivation (a derivation of the empty clause if we are lucky). Then it checks that 
each step of the proof object follows from preceding steps. 

In the ACL2 environment, EXTERNAL-PROVER is a defstub, that is, we tell 
ACL2 that it exists but that we don’t know any other properties of it. We 
use ACL2 to prove properties of REFUTE-N-CHECK (e.g., soundness), but these 
properties are necessarily independent of EXTERNAL-PROVER. 

At run time, a Common Lisp function EXTERNAL-PROVER is loaded along with 
the ACL2 code, and the Common Lisp version of EXTERNAL-PROVER overrides 
the ACL2 defstub.^ The Common Lisp version of EXTERNAL-PROVER contains 
operating system calls to build an input file for Otter, run Otter, and read 
and process Otter’s output. If the Common Lisp version of EXTERNAL-PROVER 
returns a proof object that is not well formed or is unsound, the check fails, and 
REFUTE-N-CHECK returns its input. 

A similar situation holds when searching for a countermodel with MACE. A 
defstub EXTERNAL-MODELER is used in the ACL2 environment when defining functions 
and proving properties, and a Common Lisp version of EXTERNAL-MODELER, 
which calls MACE, is loaded at run time. 

It is possible to use the preprocessing and proof checking functions of IVY 
with other first-order resolution/paramodulation provers and model searchers, 
provided they produce appropriate proof objects or models. (The format for 
proof objects can be found in [9].) This can be accomplished by simply rewriting 
the Common Lisp version of the EXTERNAL-PROVER or EXTERNAL-MODELER to call 
the desired program. 

6 The Finite Domain Assumption 

Our approach of proving soundness with respect to finite interpretations is 
certainly questionable. Consider, for example, the sentence 

(IMP (ALL X (ALL Y (IMP (= (F X) (F Y)) 
                        (= X Y)))) 
     (ALL X (EXISTS Y (= (F Y) X)))), 

that is, one-to-one functions are onto. It is not valid, but it is true for finite 
domains. Could IVY claim to have a proof of such a nontheorem? 
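
The finite half of this observation can be confirmed by brute force on small 
domains. The Python sketch below enumerates every function on an n-element 
domain and looks for one that is one-to-one but not onto; the function names are 
hypothetical and the code is an illustration, not part of IVY. 

```python
from itertools import product

def one_to_one(f, domain):
    """True when distinct arguments always have distinct values."""
    return len({f[x] for x in domain}) == len(domain)

def onto(f, domain):
    """True when every domain element is the value of some argument."""
    return {f[x] for x in domain} == set(domain)

def sentence_holds_for_size(n):
    """Check 'one-to-one implies onto' for every function on {0..n-1}."""
    domain = range(n)
    for values in product(domain, repeat=n):
        f = dict(zip(domain, values))
        if one_to_one(f, domain) and not onto(f, domain):
            return False              # a counterexample was found
    return True
```

`sentence_holds_for_size(n)` is true for every n that is small enough to 
enumerate (n = 4 already means 256 functions). On the natural numbers the 
successor function is one-to-one but never reaches 0, so the sentence fails in 
an infinite interpretation, which no finite enumeration can exhibit. 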

We conjecture that it could not, and that the weakness is in the metaproof 
method rather than the first-order proof procedure. Nonetheless, we are pursuing 
the following general approach that covers infinite interpretations. 

^ According to the ACL2 designers, having an ACL2 function call a Common Lisp 
function in this way is not officially endorsed, but it is acceptable in this situation. 

ACL2 has an encapsulation feature that allows it to reason safely about 
incompletely specified functions. We believe we can use encapsulation to abstract 
the finiteness. In our current specification, the important way in which finiteness 
enters the picture is by the definition of FEVAL-D, which recurses through 
the domain. This function, in effect, expands universally quantified formulas into 
conjunctions and existentially quantified formulas into disjunctions. Instead of 
FEVAL-D, we can consider a constrained function that chooses an element of the 
domain, if possible, that makes a formula true. When evaluating an existentially 
quantified formula, we substitute the chosen element for the existentially 
quantified variable and continue evaluating. (Evaluation of universally quantified 
variables requires some fiddling with negation.) However, proving the soundness 
of Skolemization may present complications in this approach. 

7 Performance and Availability 

Aside from the overhead of starting up ACL2, the performance of IVY is essen- 
tially the same as the performance of Otter’s autonomous mode or MACE with 
its default settings. IVY cannot yet accept parameters to be passed to Otter or 
MACE. The latest version of IVY is available from 

http://www.mcs.anl.gov/~mccune/ivy. A more complete paper on IVY can 
be found in [9]. 

Abstract. SystemOnTPTP is a WWW interface that allows an ATP 
problem to be easily and quickly submitted in various ways to a range 
of ATP systems. The interface uses a suite of currently available ATP 
systems. The interface allows the problem to be selected from the 
TPTP library or for a problem written in TPTP syntax to be provided 
by the user. The problem may be submitted to one or more of the ATP 
systems in sequence, or may be submitted via the SSCPA interface to 
multiple systems in parallel. SystemOnTPTP also can provide system 
recommendations for a problem. 



SystemOnTPTP is a WWW interface that allows an ATP problem to be easily and 
quickly submitted in various ways to a range of ATP systems. The interface uses a 
suite of currently available ATP systems, which are maintained in a database 
structure. The interface is generated directly from the database, and thus is as current 
as the database. The interface is a single WWW page, in three parts: the problem 
specification part, the mode specification part, and the system selection part. A user 
specifies the problem first, optionally selects systems for use, and then specifies the 
mode of use. 

Fig. 1. Problem specification 

Figure 1 shows the problem specification part of the WWW page. There are three 
ways that the problem can be specified. First, a problem in the TPTP library [SS98] 
can be specified by name. The interface provides a tool for browsing the TPTP 
problems if desired. Second, a file, containing a problem written in TPTP syntax, on 
the same computer as the WWW browser can be specified for uploading. The 
interface provides an option for browsing the local disk and selecting the file. Third, 
the problem formulae can be provided directly in a text window. Links to example 
TPTP files are provided to remind the user of the TPTP syntax if required. 

Figure 2 shows the start of the system selection part of the interface. There is one 
line for each system in the suite, indicating the system name and version, a default 
CPU time limit, the default tptp2X transformations for the system, the default tptp2X 
format for the system, and the default command line for the system. To select a 
system the user selects the corresponding tickbox. The default values in the text fields 
can be modified, if required. 

Fig. 2. System selection (screenshot: one tickbox row per ATP system, with 
columns ATP System, Time Limit, Transform, Format, and Command) 

Figure 3 shows the mode specification part of the interface. The lefthand side 
contains information and the RecommendSystems button for obtaining system 
recommendations for the specified problem. System recommendations are generated 
as follows. ATP problems have been classified into 14 disjoint specialist problem 
classes (SPCs) according to problem characteristics such as effective order, use of 
equality, and syntactic form. In a once-off analysis phase for each SPC, performance 
data for ATP systems (not necessarily all in the suite), for some carefully selected 
TPTP problems in the SPC, is analyzed. Systems that solve a subset of the problems 
solved by another system are discarded. The remaining systems are recorded as 
recommended, in order of the number of problems solved in the SPC. Later at run 
time, when system recommendations for a specified problem (not necessarily one of 
those used in the analysis phase, nor even from the TPTP) are requested, the 
problem is classified into its SPC and the corresponding system recommendations are 
returned. 

Fig. 3. Mode specification (screenshot: a System Recommendations panel with the 
RecommendSystems button, and a Solve the Problem at JCU panel with a password 
field, the Output mode and Parallel mode options, and the RunSelectedSystems and 
RunParallel buttons) 
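
The analysis phase for one SPC can be sketched in a few lines of Python. This is 
not the SystemOnTPTP implementation: the function name, the system and problem 
names, the choice to discard only on a proper subset (so that systems with 
identical results are both kept), and the alphabetical tie-break are assumptions. 

```python
def recommend(solved_by):
    """Order the non-dominated systems of one SPC by problems solved.

    solved_by maps a system name to the set of problems it solved.
    """
    kept = [
        name
        for name, solved in solved_by.items()
        # discard a system whose results are strictly contained in another's
        if not any(solved < other for other in solved_by.values())
    ]
    # most problems solved first; names break ties deterministically
    return sorted(kept, key=lambda name: (-len(solved_by[name]), name))

# Hypothetical performance data for a single SPC.
performance = {
    "ProverA": {"p1", "p2", "p3"},
    "ProverB": {"p1", "p2"},          # subset of ProverA: discarded
    "ProverC": {"p3", "p4"},
    "ProverD": {"p5"},
}
```

With this data `recommend(performance)` returns 
`["ProverA", "ProverC", "ProverD"]`. At run time only a lookup is needed: the 
problem is classified into its SPC and the stored list for that SPC is returned. 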

The righthand side of the mode specification part of the interface gives 
information, options, and the submit buttons for using the ATP systems sequentially 
or in parallel. When a problem is submitted using either of these submit buttons, the 
ATP systems are executed on a server at James Cook University. Due to resource 
restrictions, only one public user may submit at a time. A password field is provided 
that allows privileged users to submit at any time. The Output mode options specify 
how much output is returned during processing. In Quiet mode only the final result is 
returned, giving the type of result and the time taken. The result is either that a proof 
was found, that it was established that no proof can be found, or that the systems gave 
up trying for an unknown reason. In Interesting mode information about the progress of 
the submission is returned; see below for an example. In Verbose mode a full trace of 
the submission is returned, including the problem in TPTP format, the problem after 
transformation and formatting for the systems, and all standard output produced by 
the systems. The RunSelectedSystems button sequentially gives the specified problem 
to each of the systems selected in the system selection part of the interface. For each 
selected system, the problem is transformed and formatted using the tptp2X utility as 
specified in the system selection. The transformed and formatted problem is given to 
the system using the specified command line, with a CPU time limit as specified for 
the system. 

The Parallel mode options specify the type of parallelism to use when a problem is 
submitted using the RunParallel button. All of the parallel modes perform competition 
parallelism [SS94], i.e., multiple systems are run in parallel on the machine (using 
UNIX multitasking if there are fewer available CPUs than systems) and when any one 
gets a deciding result all of the systems are killed. The differences between the modes 
are which systems are used and the individual time limits imposed on each system’s 
execution. A limit on the total CPU time that can be taken by the executing systems is 
specified in the seconds field of the interface. In Naive selected mode all of the systems 
selected in the system selection part of the interface are run in parallel with equal 
CPU time limits (the appropriate fraction of the total time limit). In Naive mode the 
specified number of systems, taken in alphabetical order from the selection list, are run 
in parallel with equal time limits. In SSCPA mode the system recommendation 
component is used to get system recommendations for the specified problem. The 
suite of systems is then checked for versions of the recommended systems, in order of 
recommendation, until the specified number of systems have been found, or the 
recommendations are exhausted. The systems are then run in parallel with equal time 
limits. In Eager SSCPA mode the systems used are the same as for SSCPA mode, but 
the individual system time limits are calculated by repeatedly dividing the total time 
limit by two, and allocating the values to the systems in order. In this manner the 
highest recommended system gets half the total time limit, the next system gets a 
quarter, and so on, with the last two systems getting equal time limits. The 
motivations and effects of these parallel modes are discussed in [SS99]. The 
effectiveness of SSCPA was demonstrated in the CADE-16 ATP System Competition 
[Sut00]. 
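
The two ways of dividing the total time limit can be sketched in Python. The 
function names are hypothetical and this is an illustration of the allocation 
rule described above, not the SystemOnTPTP scripts. 

```python
def equal_limits(total, n):
    """Equal shares, as used by the Naive and SSCPA modes."""
    if n < 1:
        raise ValueError("at least one system is required")
    return [total / n] * n

def eager_limits(total, n):
    """Eager SSCPA: halve repeatedly; the last two systems get equal limits."""
    if n < 1:
        raise ValueError("at least one system is required")
    limits = []
    share = total
    for _ in range(n - 1):
        share /= 2                    # half, a quarter, an eighth, ...
        limits.append(share)
    limits.append(share)              # the last system repeats the final share
    return limits
```

For a total of 300 seconds and three systems, `eager_limits(300, 3)` gives 150, 
75 and 75 seconds, and the limits always sum to the total. 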

Figure 4 shows the Interesting mode output from submission of the TPTP problem 
CID003-1 in the Eager SSCPA mode. The execution is first transferred from the WWW 
server onto a SUN workstation where the ATP systems are installed. The system 
recommendation component is then invoked. The problem is identified as being real 
1st order, having some equality, in CNF, and Horn. Five systems are recommended 
for the SPC: E 0.32, E-SETHEO 99csp, Vampire 0.0, OtterMACE 437, and Gandalf 
c-1.0d. The suite of systems is then checked for versions of these systems, and it is 
found that versions of four of them, E 0.51, Vampire 0.0, OtterMACE 437, and 
Gandalf c-1.0d, are in the suite. The submission required three systems, so E 0.51, 
Vampire 0.0, and OtterMACE 437 are used. The individual system time limits out of 
the total specified limit of 300 seconds are then computed, 150 seconds for E 0.51 and 
75 seconds each for Vampire 0.0 and OtterMACE 437. The problem is then 
transformed and formatted for each of the systems, and the systems are run in parallel. 
E 0.51 finds a proof after 39.8 seconds CPU time, 62.2 seconds wall clock time, at 
which stage all of the systems are killed. 

Fig. 4. Interesting output for CID003-1 in Eager SSCPA mode 

SystemOnTPTP is implemented by Perl scripts that generate the WWW interface 
from the system database, accept the submission from the browser, extract the system 
recommendations, invoke the tptp2X utility to transform and reformat the problem, 
and control the execution of the systems. The interface is available at: 

http://www.cs.jcu.edu.au/cgi-bin/tptp/SystemOnTPTPFormMaker 

SystemOnTPTP allows users to easily and quickly submit a problem in 
TPTP syntax to an appropriate ATP system. The user is absolved of the 
responsibilities and chores of selecting systems to use, installing the systems, 
transforming and formatting the problem for the systems, and controlling their 
execution. This user friendly environment is particularly appropriate for ATP system 
users who want to focus on the problem content rather than the mechanisms of ATP. 
The interface is not designed for, and is therefore not suitable for, users who wish to 
submit a batch of problems to a particular ATP system. Such users should obviously 
install that ATP system on their own computer, which would also allow use of the 
system’s own input format rather than the TPTP format. ATP system developers are 
invited to submit their systems and performance data for inclusion and use in the 
interface. 

Introduction 

PTTP+GLiDeS is a semantically guided linear deduction theorem prover, built 
from PTTP [9] and MACE [7]. It takes problems in clause normal form (CNF), 
generates semantic information about the clauses, and then uses the semantic 
information to guide its search for a proof. 

In the last decade there has been some work done in the area of semantic 
guidance, in a variety of first order theorem proving paradigms: SCOTT [8] is 
based on OTTER and is a forward chaining resolution system; CLIN-S [3] uses 
hyperlinking; RAMCS [2] uses constrained clauses to allow it to search for proofs 
and models simultaneously; and SGLD [11] is a chain format linear deduction 
system based on Graph Construction. Of these, CLIN-S and SGLD need to 
be supplied with semantics by the user. SCOTT uses FINDER [8] to generate 
models, and RAMCS generates its own models. 



The Semantic Guidance 

PTTP+GLiDeS uses a semantic pruning strategy that is based upon the strat- 
egy that can be applied to linear-input deductions. In a completed linear-input 
refutation, all centre clauses are FALSE in all models of the side clauses. This 
leads to a semantic pruning strategy that, at every stage of a linear-input deduc- 
tion, requires all centre clauses in the deduction so far to be FALSE in a model 
of the side clauses. To implement this strategy it is necessary to know which 
are the potential side clauses, so that a model can be built. A simple possibil- 
ity is to choose a negative top clause from a set of Horn clauses, in which case 
the mixed clauses are the potential side clauses. More sensitive analysis is also 
possible [4,10]. Linear-input deduction and this pruning strategy are complete 
only for Horn clauses. Unfortunately, the extension of this pruning strategy to 
linear deduction, which is also complete for non-Horn clauses, is not direct. The 
possibility of ancestor resolutions means that centre clauses may be TRUE in a 
model of the side clauses. 

In PTTP+GLiDeS, rather than placing a constraint on entire centre clauses, 
a semantic constraint is placed on certain literals of the centre clauses: The 
input clauses other than the chosen top clause of a linear deduction are named
the model clauses. In a completed linear refutation, all centre clause literals that
have resolved against input clause literals are required to be FALSE in a model of 
the model clauses. TRUE centre clause literals must be resolved against ancestor 
clause literals. 

PTTP+GLiDeS implements linear deduction using the Model Elimination [6] 
(ME) paradigm. ME uses a chain format, where a chain is an ordered list of A- 
and B-literals. The disjunction of the B-literals is the clause represented by 
the chain. Input chains are generated from the input clauses and are composed 
entirely of B-literals. The chains that form the linear sequence are called the 
centre chains. A-literals are used in centre chains to record information about 
ancestor clauses in the deduction. The input chains that are resolved against
centre chains are called side chains. 

PTTP+GLiDeS maintains a list of all the A-literals created throughout the
entire deduction. This list is called the A-list. The pruning strategy requires that
at every stage of the deduction, there must exist at least one ground instance of
the A-list that is FALSE in a model of the model clauses. The result is that only
FALSE B-literals are extended upon, and TRUE B-literals must reduce. Figure 1
shows an example of a PTTP+GLiDeS refutation. The problem clauses are
$\{\neg money \lor tickets(buy),\ \neg tickets(sell) \lor money,\ money \lor tickets(X),\ \neg money \lor \neg tickets(X)\}$.
The clause $\neg money \lor \neg tickets(X)$ is chosen to form the top
chain, so that the other three clauses are the model clauses. The model M is
$\{money,\ tickets(buy),\ \neg tickets(sell)\}$. The A-list is shown in braces under the
centre chains.

Since the work described in [1], PTTP+GLiDeS has been enhanced to order
the use of side chains, using the model of the model clauses. The model is used
to give a score to each clause as follows: If there are N ground domain instances
of a clause C with k literals, then for each literal L, $n_L$ is the number of TRUE
instances of L within the N ground instances. L is given the score $n_L/N$. The score
for the clause C is $\frac{1}{k}\sum_{L \in C} n_L/N$. The clause set is then ordered in descending
order of scores. This gives preference to clauses that have more TRUE literal
instances in the model. The use of these clauses leads to early pruning and forces
the deduction into areas more likely to lead to a proof.
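
The scoring scheme can be sketched in a few lines of Python. This is only an illustration, not the PTTP+GLiDeS code: the tuple encoding of literals, the two-element domain {buy, sell}, and the helper names `is_true` and `clause_score` are assumptions made for the example, and the clause score is assumed to be the mean, over the k literals, of the fraction of TRUE ground instances.

```python
from itertools import product

# Assumed finite model over the domain {buy, sell}, following the running
# example: money and tickets(buy) are TRUE, tickets(sell) is FALSE.
DOMAIN = ["buy", "sell"]
TRUE_ATOMS = {("money",), ("tickets", "buy")}


def is_true(literal, binding):
    """Evaluate a literal (sign, predicate, args) under a variable binding."""
    sign, pred, args = literal
    ground = (pred,) + tuple(binding.get(a, a) for a in args)
    return (ground in TRUE_ATOMS) == sign


def clause_score(clause):
    """Mean, over the literals, of the fraction of TRUE ground instances."""
    # Variables are written as upper-case strings, constants in lower case.
    variables = sorted({a for _, _, args in clause for a in args if a.isupper()})
    bindings = [dict(zip(variables, values))
                for values in product(DOMAIN, repeat=len(variables))]
    n = len(bindings)  # N ground domain instances of the clause
    per_literal = [sum(is_true(lit, b) for b in bindings) / n for lit in clause]
    return sum(per_literal) / len(clause)


# The three model clauses of the example.
MODEL_CLAUSES = {
    "c1": [(False, "money", ()), (True, "tickets", ("buy",))],
    "c2": [(False, "tickets", ("sell",)), (True, "money", ())],
    "c3": [(True, "money", ()), (True, "tickets", ("X",))],
}

scores = {name: clause_score(c) for name, c in MODEL_CLAUSES.items()}
ordered = sorted(scores, key=scores.get, reverse=True)  # descending order
```

Under these assumptions the clause with the most TRUE literal instances is tried first, which is the preference described above.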



Implementation 

PTTP+GLiDeS consists of a modified version of PTTP version 2e and MACE
v1.3.3, combined with a csh script. It requires the problem to be presented in
both PTTP and OTTER formats. The OTTER format file is processed so that
it contains only the model clauses, and is used by MACE.

[Figure 1: derivation diagram of the refutation. Each centre chain is shown with the A-list in braces beneath it, and each step is labelled as an extension, reduction, or truncation. The branch whose A-list becomes { -tickets(buy), -money, -money, -tickets(sell) } fails and is backtracked; the branch whose A-list becomes { -tickets(buy), -money, -money, tickets(sell) } continues to the empty chain.]

Fig. 1. A PTTP+GLiDeS refutation

Initially the domain size for the model to be generated by MACE is set to
equal the number of constants in the problem. If a model of this size cannot
be found, the domain size is reset to 2 and MACE is allowed to determine the
domain size. If no model is found PTTP+GLiDeS exits. If a model is found,
the modified PTTP is started. The modified PTTP uses the model to reorder
the clause set, then transforms the reordered clauses into Prolog procedures
that implement the ME deduction and maintain the A-list. A semantic check
is performed on the A-list after each extension and reduction operation. If the
A-list does not have an instance in which every literal evaluates to FALSE in
the model provided by MACE, then the extension or reduction is rejected.
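
The semantic check can be illustrated with a short Python sketch. It is not the Prolog code generated by the modified PTTP: the literal encoding, the domain {buy, sell}, and the function names are assumptions made for illustration, and the check is done by brute-force enumeration of ground instances. The two A-lists tested are the ones that appear at the branching point of Fig. 1.

```python
from itertools import product

# Assumed two-element domain and model M from the running example.
DOMAIN = ["buy", "sell"]
TRUE_ATOMS = {("money",), ("tickets", "buy")}


def literal_is_false(literal, binding):
    """A literal (sign, predicate, args) is FALSE if its truth value differs from its sign."""
    sign, pred, args = literal
    ground = (pred,) + tuple(binding.get(a, a) for a in args)
    return (ground in TRUE_ATOMS) != sign


def a_list_admissible(a_list):
    """True if some ground instance of the A-list has every literal FALSE."""
    variables = sorted({a for _, _, args in a_list for a in args if a.isupper()})
    for values in product(DOMAIN, repeat=len(variables)):
        binding = dict(zip(variables, values))
        if all(literal_is_false(lit, binding) for lit in a_list):
            return True
    return False


NOT_TICKETS_BUY = (False, "tickets", ("buy",))
NOT_MONEY = (False, "money", ())
NOT_TICKETS_SELL = (False, "tickets", ("sell",))
TICKETS_SELL = (True, "tickets", ("sell",))

# The A-list of the successful branch passes the check ...
accepted = a_list_admissible([NOT_TICKETS_BUY, NOT_MONEY, NOT_MONEY, TICKETS_SELL])
# ... while the A-list of the failed branch contains a TRUE literal.
rejected = a_list_admissible([NOT_TICKETS_BUY, NOT_MONEY, NOT_MONEY, NOT_TICKETS_SELL])
```

Because the A-list may contain variables, the check is existential: a single ground instance with all literals FALSE is enough for the step to be kept.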



Performance 

Testing was carried out on 541 “difficult” problems from the TPTP problem
library [12] v2.1.0. Both PTTP and PTTP+GLiDeS were tested on the same
problems under the same conditions. Experiments were carried out on a SunSPARC
20 server using ECLiPSe v3.7.1 as the Prolog engine. A CPU time limit of 300
seconds was used. The results are summarized in Table 1. MACE failed to
generate a model in 272 cases, and so PTTP+GLiDeS could not attempt those
problems. Of the 269 problems where models were generated, the worst performance
is on Horn problems: all of the problems solved by PTTP and not PTTP+GLiDeS
are Horn. MACE tends to generate trivial models (positive literals TRUE) for
Horn problems. If the top centre clause is negative then, for a Horn clause set, a
trivial model does not lead to any pruning. With the additional overhead of the
semantic checking this leads to poor performance by PTTP+GLiDeS. Of the 269
models produced by MACE, 155 were effective, i.e., resulted in some pruning. Of
the problems with effective models solved by both PTTP and PTTP+GLiDeS,
in 13 out of 18 cases PTTP+GLiDeS had a lower inference count; in some cases
significantly lower. This is reflected in the average number of inferences, which
for PTTP+GLiDeS is 2.5 times smaller than that of PTTP, and shows
that the pruning is having a positive effect. PTTP+GLiDeS performs best on
non-Horn problems. Table 2 shows some results for non-Horn problems where
PTTP+GLiDeS performed better than PTTP. For these problems even trivial
models can be of assistance.



                                          (Horn/Non-Horn)
Total number of problems:                 541 (311/230)
CPU time limit:                           300 s
Number of models generated:               269 (227/42)

                                          PTTP            PTTP+GLiDeS
Number of problems solved from 269:       66 (60/6)       59 (51/8)
Number of effective models generated:     155 (120/35)
Number of problems solved from 155:       21 (16/5)       20 (13/7)

For the 18 problems (from 155) solved by both systems:
                                          PTTP            PTTP+GLiDeS
Average CPU Time:                         34.24           69.18
Average Number of Inferences:             119634.28       47812.22
Average Number of Rej. Inferences:                        3896.78

Table 1. Summary of experimental data.
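
The factor of 2.5 quoted in the text can be rechecked from the Table 1 averages. The snippet below is only an arithmetic check on the published figures; the variable names are ours.

```python
# Averages over the 18 problems solved by both systems (Table 1).
pttp_inferences = 119634.28
glides_inferences = 47812.22
pttp_cpu = 34.24
glides_cpu = 69.18

# How many times more inferences PTTP performs than PTTP+GLiDeS.
inference_ratio = pttp_inferences / glides_inferences
# How much more CPU time PTTP+GLiDeS uses, reflecting the semantic checking overhead.
cpu_ratio = glides_cpu / pttp_cpu

# Consistency of the Horn/non-Horn splits reported in the table.
splits_consistent = (311 + 230 == 541) and (227 + 42 == 269) and (120 + 35 == 155)

print(f"inference ratio: {inference_ratio:.2f}, CPU ratio: {cpu_ratio:.2f}")
```

So on these problems the pruning roughly halves the search more than once over, while the cost of the semantic checks about doubles the average CPU time.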



With respect to ordering of the clause set, experiments have been carried out
using both ascending and descending order of truth score. Initially
it was thought that ordering the clause set in ascending order of truth score
(from ‘less TRUE’ to ‘more TRUE’) would lead the search away from pruning
and therefore towards the proof. This turns out not to be the case. While the
results are not statistically significantly different in terms of rejected inferences
and inferences, descending ordering solved 4 more problems overall, of which 3
had effective models. As solving problems is the most significant measure of a
theorem prover’s ability, this shows that pruning early is more effective.

                           PTTP+GLiDeS                          PTTP
Problem    Model           CPU Time  Inferences  Rejected       CPU Time  Inferences
                                                 Inferences
CAT003-3   non-Trivial       68.5       64232      10451        TIMEOUT
CAT012-3   Trivial           32.0       49220       4174          54.6      175367
GRP008-1   Trivial          248.3      404198       3937        TIMEOUT
SYN071-1   non-Trivial       70.1       84908      27653         262.8      832600

Table 2. Results for some non-Horn problems where PTTP+GLiDeS outperforms PTTP.



Conclusion 

In those cases where a strongly effective model has been obtained, results are 
good. This leads to the question, “what makes a model effective?” At present 
the first model generated by MACE is used. If the characteristics of a strongly 
effective model can be quantified then it should be possible to generate many 
models and select the one most likely to give good performance. 

PTTP is not a high performance implementation of ME, and thus the
performance of PTTP and PTTP+GLiDeS is somewhat worse than that of current
state-of-the-art ATP systems. This work has used PTTP to establish the via- 
bility of the semantic pruning strategy. It is planned to implement the pruning 
strategy in the high performance ME implementation SETHEO [5], in the near 
future. 

On the completeness issue, this prover prunes away proofs which contain 
complementary A-literals on different branches of the tableau. In the few cases 
examined to date, another proof that conforms to this extended admissibility 
rule has always been found. Whether there is always another such proof is not 
known.

Calculus up to α-Conversion

Abstract. We experiment with a method for representing a concurrent
object calculus in the Calculus of Inductive Constructions. Terms are first
defined in de Bruijn style, then names are re-introduced in binders. The
terms of the calculus are formalized in the mechanized logic by suitable
subsets of the de Bruijn terms; namely those whose de Bruijn indices
are relegated behind the scenes. The α-equivalence relation is the Leibniz
equality and the substitution functions can be defined as sets of partial
rewriting rules on these terms. We prove induction schemes for both the
terms and some properties of the calculus which internalize the renaming
of bound variables. We show that, despite the fact that the terms which
formalize the calculus are not generated by a least fixed point relation, we
can prove the desired inversion lemmas. We formalize the computational
part of the semantics and a simple type system of the calculus. Finally,
we prove a subject reduction theorem and see that the specifications and
proofs have the nice feature of not mixing de Bruijn technical
manipulations with real proofs.



1 Introduction 

Providing a satisfactory method to encode the binding operators of a
programming language when we want to formalize it in a Logical Framework is still a
challenge. Although many different methods have been proposed so far, none
seems completely satisfactory. From de Bruijn codes to higher order encoding,
each method has its advantages and disadvantages, its supporters and its
detractors. A major problem raised by all these methods is that theorems and their
proofs become tightly linked to the chosen encoding. In other words, even if it is
sometimes possible to have specifications and theorems close to the unmechanized
version (which is not the case for de Bruijn encoding), the proof structures
are themselves very different from the informal ones. In this paper, we show that
the method proposed in [Gor94] for representing binders in mechanized logic can
successfully be extended to a large calculus with different kinds of binders.
Besides, we show that, once some work has been done with de Bruijn indices, real
proofs do not manipulate them; moreover, they are similar to the unmechanized
ones.

We have chosen to formalize the concς-calculus [GH98], a concurrent object
calculus consisting of M. Abadi and L. Cardelli’s imperative object calculus impς
[AC96] extended with primitives from the π-calculus [MPW92]. This calculus
was introduced as a possible formalism for modeling computations based on
concurrent processes and objects. We think that formal proofs of properties of
protocols can be realized within a proof assistant if we have good methods for
encoding such calculi. Our choice of this calculus is motivated by its size, its
different kinds of binders and its good expressiveness, thus giving an idea of
the real problems that arise when we encode such formalisms in computational
logics.

The COQ system we use for our implementation is a proof assistant based
on the calculus of inductive constructions [Wer94], a higher order logic with
dependent types and inductive definitions. All the proofs have been done with
the user interface CtCoq [BBC+97]. This paper has been written so as to be
understood by people not familiar with the COQ system. We use mathematical
notations instead of the COQ syntax and we only show significant parts of large
COQ encodings. Please refer to [Gil00] for a full presentation of our technical
results.

Organization of the Paper: In section 2 we formalize the concς-calculus in
COQ. In section 3 we prove a more powerful induction theorem for the terms
of the calculus which internalizes the renaming of bound names. In section 4 we
formalize the semantics of the concς-calculus in COQ and we produce an
efficient induction principle for the semantic relation. In section 5 we give a simple
type system for the calculus and discuss the problems raised by the inversion 
lemmas generated by the COQ system on this example. Section 6 presents the 
statement and the proof of the subject reduction theorem. Section 7 is a short 
discussion about the formalizing technique we used. Finally, section 8 draws 
some conclusions. 

2 The concς-Calculus

This calculus was first introduced by A. Gordon and P. Hankin. It is the
imperative object calculus impς of M. Abadi and L. Cardelli in which objects are
located at addresses, extended with a parallel composition and the name
restriction operator from the π-calculus. The reader interested in a more detailed
presentation of this calculus should refer to [GH98].

2.1 Informal Syntax of the Calculus 

We assume that there are infinite disjoint sets of references, variables and labels.
We distinguish references, representing addresses of stored objects (channels in
the π-calculus), from variables, representing intermediate values (variables in the
λ-calculus). Let us call both notions names. The expressions of the language are
defined as follows:

Table 1. Syntax of the informal concς-calculus

$l$                                              labels
$u, v ::=$                                       results
    $x, y, z$                                    variables
    $p, q$                                       references
$a, b, c ::=$                                    terms
    $u, v$                                       result
    $p \mapsto d$                                denomination
    $p.l$                                        method select
    $p.l \Leftarrow \varsigma(x)b$               method update
    $clone(p)$                                   cloning
    $let\ x = a\ in\ b$                          let
    $a \upharpoonright b$                        parallel composition
    $(\nu p).a$                                  restriction
$d ::=$                                          denotations
    $[l_i = \varsigma(x_i)a_i\ ^{i \in 1..n}]$   object

In the method $\varsigma(x)b$ and in the expression $let\ x = a\ in\ b$, the variable x is
bound in b. In a restriction $(\nu p).a$, the reference p is bound in a. The notation
$a = b$ means that the terms a and b are equal up to renaming of bound names and
reordering of the labeled components of objects.



2.2 De Bruijn Specification 

We define a de Bruijn syntax (table 2) in which free names n, m are encoded by
named names [dB94]. Variables x, y, z are either free variables x, y, z or de Bruijn
variables (dvar i), (dvar j); labels $l, l_i$ are named; and references p, q are either
free references p, q or de Bruijn references (dref i), (dref j). We assume there
are infinite disjoint sets of references, variables and labels and that equality is
decidable on each of these sets.



Table 2. De Bruijn formalism

$DB ::= x \mid p \mid p \mapsto ODB \mid DB.l \mid DB.l \Leftarrow_{db} DB \mid clone(DB) \mid let_{db}\ DB\ in\ DB \mid DB \upharpoonright DB \mid \nu_{db}.DB$

$ODB ::= [\,] \mid [l : DB :: ODB]_{db}$



Unless otherwise stated, in the sequel of the paper we no longer refer to the
references, variables, labels and results of the informal calculus described in 2.1. De
Bruijn terms (DB) a, b, c are called dbterms, lists of dbterms (ODB) are called
denotations and, for readability reasons, var and ref will be used for free variables
and free references respectively in all our formal definitions.

The binding constructors here are $\Leftarrow_{db}$, $let_{db}$ (in their second argument) and
$[\ ]_{db}$ (the self variable of each method is bound by the object constructor) for de
Bruijn variables. The $\nu_{db}$ operator is the only binding constructor for de Bruijn
references. Objects are represented by lists. Although sets seem closer to the idea
of objects (a collection of attributes and methods), we cannot define objects as
sets because sets in COQ are specified as predicates, and predicates cannot be
used in the type of a constructor of an inductive set. Moreover, COQ provides
an efficient tool for generating induction schemes for mutual inductive definitions
when some type is a list of another¹.

Our syntax is a bit more general than the one proposed by A. Gordon for his
concς-calculus in the sense that we allow cloning, method calling, and method
updating not only for results but for all terms of our syntax. This choice was
made because dbterms are less nested than they would have been with a de
Bruijn result type. Since the concς-calculus terms will eventually be identified
by an inductively defined subset of dbterm, this will have no consequences on its
formalization in COQ.

Thanks to de Bruijn indices we do not need an α-equivalence notion, and
$a = b$ means that the dbterms a and b are equal in the sense of the Leibniz
equality. The de Bruijn formalization of binders strips the syntax of its intuitive
meaning. We shall show how to recover it later on (see 2.5).

As we have de Bruijn indices for both references and variables, we define two
degree functions (computing the usual notion of degree for a term with de Bruijn
indices [dB94]), one for each kind. We omit their formal definitions here so that
readers do not get confused with too many technical details (see [Gil00] for more
details). We generalize the notion of closed de Bruijn terms to our calculus such
that a dbterm is said to be closed when both of its degrees are zero.

2.3 Functions as Binders

Abstraction and Instantiation Functions. We define a variable abstraction
function $Abst_v$. For a given dbterm a, $Abst_v(a\ i\ x)$ is computed by substituting
in a all the occurrences of the variable x by the de Bruijn variable (dvar i). The
substitution is defined recursively on the dbterms so that the de Bruijn index
substituted is increased by one each time a binder is met. In a dual way, we
define an instantiation function $Inst_v$. $Inst_v(a\ i\ x)$ is computed by substituting
all the occurrences of (dvar i) in a by the variable x. Similarly, we define $Abst_r$
and $Inst_r$ on references.

Functions as Constructors. We define new functions on dbterms let, res, and
sigma behaving like the de Bruijn binding operators except that they use names
in their arguments (see table 3).

In the following, we shall write $let\ x := a\ in\ b$ for (let x a b), $\nu p.a$ for (res p a)
and $\varsigma(x).a$ for (sigma x a). We will also drop the db mark in a constructor when
it can be guessed from the context.

2.4 Substitution and α-Equivalence

To relegate the de Bruijn indices of the underlying terms behind the scenes we
also need to define two new substitution functions $Subst_v$ and $Subst_r$. Intuitively
these functions are defined to rename free names in dbterms. Their definitions
(see table 4) use de Bruijn indices in their bodies but, with some work,
we shall manipulate them without referring to de Bruijn indices (see 3.2). We
write $a[x/y]_v$ and $a[p/q]_r$ for $Subst_v(a\ y\ x)$ and $Subst_r(a\ q\ p)$ respectively.

Table 3. Functions as constructors

$(res\ p\ a) = \nu_{db}.Abst_r(a\ 0\ p)$        $(sigma\ x\ a) = Abst_v(a\ 0\ x)$
$(let\ x\ a\ b) = let_{db}\ a\ in\ Abst_v(b\ 0\ x)$

¹ Using the Scheme tactic.



Table 4. Substitution functions 



$Subst_v(a\ x\ y) = Inst_v(Abst_v(a\ 0\ x)\ 0\ y)$        $Subst_r(a\ q\ p) = Inst_r(Abst_r(a\ 0\ q)\ 0\ p)$



If we think of the ν, let and ς functions as constructors and $Subst_v$ and $Subst_r$
as renaming functions, we prove that α-equivalent dbterms are encoded in our
formalism by a unique dbterm.
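
The interplay of abstraction, instantiation, the named let function and substitution can be tried out on a small fragment. The Python sketch below is an illustration under stated assumptions, not a transcription of the COQ development: it keeps only free variables, de Bruijn variables, the let binder and parallel composition, encodes dbterms as tuples, and all function names are ours.

```python
def abst_v(a, i, x):
    """Abst_v(a i x): replace the free variable x by the de Bruijn variable (dvar i)."""
    tag = a[0]
    if tag == "var":
        return ("dvar", i) if a[1] == x else a
    if tag == "dvar":
        return a
    if tag == "let":  # let_db binds in its second argument only
        return ("let", abst_v(a[1], i, x), abst_v(a[2], i + 1, x))
    if tag == "par":
        return ("par", abst_v(a[1], i, x), abst_v(a[2], i, x))
    raise ValueError(f"unknown constructor: {tag}")


def inst_v(a, i, x):
    """Inst_v(a i x): replace the de Bruijn variable (dvar i) by the free variable x."""
    tag = a[0]
    if tag == "var":
        return a
    if tag == "dvar":
        return ("var", x) if a[1] == i else a
    if tag == "let":
        return ("let", inst_v(a[1], i, x), inst_v(a[2], i + 1, x))
    if tag == "par":
        return ("par", inst_v(a[1], i, x), inst_v(a[2], i, x))
    raise ValueError(f"unknown constructor: {tag}")


def let(x, a, b):
    """Named let: behaves like a constructor but hides the de Bruijn index."""
    return ("let", a, abst_v(b, 0, x))


def subst_v(a, x, y):
    """Subst_v(a x y) = Inst_v(Abst_v(a 0 x) 0 y): rename the free variable x to y."""
    return inst_v(abst_v(a, 0, x), 0, y)


def var(name):
    return ("var", name)


# Two alpha-equivalent named terms are the very same dbterm.
t1 = let("x", var("z"), ("par", var("x"), var("y")))
t2 = let("w", var("z"), ("par", var("w"), var("y")))
same_dbterm = t1 == t2

# Renaming the free variable y to u commutes with the binder when x is
# different from both y and u.
lhs = subst_v(t1, "y", "u")
rhs = let("x", subst_v(var("z"), "y", "u"),
          subst_v(("par", var("x"), var("y")), "y", "u"))
```

In this encoding α-equivalence needs no separate definition: it is ordinary structural equality on the tuples, which mirrors the use of Leibniz equality in the paper.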



2.5 Formalization of the Syntax 

The result type (u, v) is defined as the disjoint union of our free names:

$result = var \mid ref$

The inductive predicate Term (table 5) defines the subset of dbterm which
formalizes the concς-calculus. The proof of correctness of this encoding is
straightforward (omitted) if one thinks of let, res and sigma as constructors.

From now on, we shall call Term a dbterm having the Term property and
write $\forall a : Term.(P\ a)$ as a shorthand for $\forall a : dbterm.(Term\ a) \Rightarrow (P\ a)$ in the
translation of our COQ notations. Terms are based on a de Bruijn formalism but
de Bruijn indices are hidden in the syntax by the let, res and sigma functions.



3 An Induction Principle for the concς-Calculus

In the sequel, for the sake of clarity, we only show one (significant) case of the
theorems. A more detailed presentation is available in [Gil00].

Table 5. Formalization of the concς-calculus in COQ

$Term : dbterm \to Prop$
$Resu: \forall r : result.(Term\ r)$
$Deno: \forall p : ref.\forall obj : denotation.(OTerm\ obj) \Rightarrow (Term\ p \mapsto obj)$
$Msel: \forall l : labels.\forall u : result.(Term\ u.l)$
$Mupd: \forall u : result.\forall a : dbterm.\forall l : labels.\forall x : var.(Term\ a) \Rightarrow (Term\ u.l \Leftarrow \varsigma(x).a)$
$Clone: \forall u : result.(Term\ (clone\ u))$
$Let: \forall a, b : dbterm.\forall x : var.(Term\ a) \Rightarrow (Term\ b) \Rightarrow (Term\ let\ x := a\ in\ b)$
$Par: \forall a, b : dbterm.(Term\ a) \Rightarrow (Term\ b) \Rightarrow (Term\ a \upharpoonright b)$
$Res: \forall a : dbterm.\forall p : ref.(Term\ a) \Rightarrow (Term\ \nu p.a)$
with
$OTerm : denotation \to Prop$
$Mnul: (OTerm\ [\,])$
$Mocons: \forall l : label.\forall a : dbterm.\forall obj : denotation.\forall x : var.(Term\ a) \Rightarrow (OTerm\ obj) \Rightarrow (OTerm\ (l : \varsigma(x).a :: obj))$



3.1 The Induction Scheme Generated by the COQ System 

It appears that the induction scheme generated by COQ (table 6) for the
predicate Term is not powerful enough for our purpose².

Table 6. Induction scheme generated by COQ

$Term\_ind :=$
$\forall P : dbterm \to Prop.$
$(\forall x : var.\forall a, b : Term.(P\ a) \Rightarrow (P\ b) \Rightarrow (P\ let\ x := a\ in\ b)) \Rightarrow$
$\forall a : Term.(P\ a).$



For example, it is not clear how one can derive with it the fact that Terms are closed
under the substitutions $Subst_v$ and $Subst_r$. We shall not be able to deduce
$(Term\ (let\ x := a\ in\ b)[z/y]_v)$ from $(Term\ a[z/y]_v)$ and $(Term\ b[z/y]_v)$ because
we do not have the necessary information on x, y and z to compute
$(let\ x := a\ in\ b)[z/y]_v$. In this COQ formalization of the concς-calculus, α-equivalent
terms are equal. We want to integrate within the induction scheme of Term the
fact that bound names, in Terms, can always be renamed.



² Actually we need to use the Scheme tactic to generate an efficient principle.

3.2 An Intermediate Induction Scheme for Terms

We define a new function length on dbterms which computes the number of
constructors appearing in a term. The order $<_{length}$ induced on dbterms by this
function is well founded and we show that renaming names in a dbterm does
not change its length. Using the general induction theorem for well founded
relations with respect to $<_{length}$ we prove a more powerful induction scheme
than Term_ind for the Term relation (table 7).



Table 7. Intermediate induction scheme 



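$Term\_length\_ind :=$
$\forall P : dbterm \to Prop.$
$(\forall x : var.\forall a, b : Term.(P\ a) \Rightarrow (\forall b' : Term.\ length(b) = length(b') \Rightarrow (P\ b')) \Rightarrow (P\ let\ x := a\ in\ b)) \Rightarrow$
$\forall a : Term.(P\ a).$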



With this theorem, as the length of dbterms is invariant under the renaming
functions, we prove that substitution can be propagated inside binders for Terms
if the side conditions are satisfied (see table 8).



Table 8. Substitution rewriting rules for the let binder

$let\_rw1: \forall x, y, z : var.\forall a, b : Term.\ x \neq y \Rightarrow x \neq z \Rightarrow$
$\quad (let\ x := a\ in\ b)[z/y] = let\ x := a[z/y]\ in\ b[z/y].$

$let\_rw2: \forall x, y : var.\forall a, b : Term.\ (let\ x := a\ in\ b)[y/x] = let\ x := a[y/x]\ in\ b.$



3.3 The Full Induction Scheme for Terms

The use of the length function inside Term_length_ind is not satisfactory because
it is not natural. We prove a final induction scheme on Term
(table 9) using the Term_length_ind theorem and the properties of the
substitution functions we have deduced from it.

This induction scheme internalizes the property that bound names can always
be chosen outside any finite set of names in the context.

Example: Given a property P on Term, we prove that it holds for all Terms
using the Term_induction theorem. In order to prove that $(P\ let\ x := a\ in\ b)$
holds, we select a finite set X and try to solve our goal under the assumptions
$(P\ a)$, $(P\ b)$ and $x \notin X$. Giving the set X amounts to specifying that x is a fresh
variable.

Table 9. Induction scheme for the Term predicate

$Term\_induction :=$
$\forall P : dbterm \to Prop.$
$(\forall a : Term.(P\ a) \Rightarrow (\exists X : set.(Finite\ X) \land$
$\quad \forall x : var.\forall b : Term.\ x \notin X \Rightarrow (P\ b) \Rightarrow (P\ let\ x := a\ in\ b))) \Rightarrow$
$\forall a : Term.(P\ a).$
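
The side condition $x \notin X$ has a direct computational reading: whenever a binder is opened, a name outside a finite avoid-set is chosen. The Python sketch below illustrates only this reading, on an assumed tuple encoding restricted to variables, let and parallel composition; it is a recursive traversal, not the COQ induction principle, and the names `fresh`, `free_vars` and `show` are ours.

```python
from itertools import count


def inst_v(a, i, x):
    """Replace the de Bruijn variable (dvar i) by the free variable x."""
    tag = a[0]
    if tag == "var":
        return a
    if tag == "dvar":
        return ("var", x) if a[1] == i else a
    if tag == "let":
        return ("let", inst_v(a[1], i, x), inst_v(a[2], i + 1, x))
    if tag == "par":
        return ("par", inst_v(a[1], i, x), inst_v(a[2], i, x))
    raise ValueError(f"unknown constructor: {tag}")


def free_vars(a):
    """Set of free variable names occurring in a dbterm."""
    tag = a[0]
    if tag == "var":
        return {a[1]}
    if tag == "dvar":
        return set()
    return free_vars(a[1]) | free_vars(a[2])


def fresh(avoid):
    """First name among x0, x1, ... that lies outside the finite set avoid."""
    return next(f"x{n}" for n in count() if f"x{n}" not in avoid)


def show(a, avoid=frozenset()):
    """Print a Term with names, picking each bound name outside the avoid-set."""
    tag = a[0]
    if tag == "var":
        return a[1]
    if tag == "par":
        return "(" + show(a[1], avoid) + " ↾ " + show(a[2], avoid) + ")"
    if tag == "let":
        x = fresh(avoid | free_vars(a))   # x is chosen outside the finite set X
        body = inst_v(a[2], 0, x)         # open the binder with the fresh name
        inner_avoid = avoid | {x}
        return "let " + x + " := " + show(a[1], avoid) + " in " + show(body, inner_avoid)
    raise ValueError(f"unknown constructor: {tag}")


term = ("let", ("var", "z"), ("par", ("dvar", 0), ("var", "x0")))
named = show(term)
```

The traversal never inspects a de Bruijn index directly in the let case: it only sees the body after a fresh name has been substituted, which is the style of reasoning the full induction scheme makes available in proofs.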



4 Semantics of the concς-Calculus

The semantics of the calculus is given by a reduction relation and a structural
congruence. The formalization of the reduction rules in COQ is natural and we
can prove an induction scheme which internalizes α-renaming.



4.1 Rules for the Semantics 

Informal Semantics. Terms of the calculus are interpreted either as processes
or as expressions. Expressions and processes are concurrent computations but an
expression is expected to return a result while a process is not. As opposed to
many concurrent calculi, the parallel composition ($\upharpoonright$) is not commutative. The
term $a \upharpoonright b$ is an expression in which a and b run in parallel. Its result is the result
returned by b; any result returned by a is discarded. The structural congruence
($\equiv$), apart from the unusual behavior of ($\upharpoonright$), is standard. The reduction
relation ($\to_{red}$) (table 10) is analogous to the β-reduction of the λ-calculus. The
structural congruence relation allows the rearrangement inside a term so that
reduction may be applied. Please refer to [GH98] for motivations and more details
on this semantics.



Table 10. Reduction relation: $a \to b$

For the first three rules, let $d = [l_i = \varsigma(x_i)b_i\ ^{i \in 1..n}]$.

$(p \mapsto d) \upharpoonright p.l_j \to (p \mapsto d) \upharpoonright b_j[p/x_j]$   if $j \in 1..n$
$(p \mapsto d) \upharpoonright (p.l_j \Leftarrow \varsigma(x).b) \to (p \mapsto d') \upharpoonright p$   if $j \in 1..n$, where $d' = [l_j = \varsigma(x).b,\ l_i = \varsigma(x_i)b_i\ ^{i \in (1..n) \setminus \{j\}}]$
$(p \mapsto d) \upharpoonright (clone\ p) \to (p \mapsto d) \upharpoonright (\nu q).((q \mapsto d) \upharpoonright q)$   if $q \notin fn(d)$
$let\ x := p\ in\ a \to a[p/x]$
$(\nu p).a \to (\nu p).a'$   if $a \to a'$
$(a \upharpoonright b) \to (a' \upharpoonright b)$   if $a \to a'$
$(b \upharpoonright a) \to (b \upharpoonright a')$   if $a \to a'$
$let\ x := a\ in\ b \to let\ x := a'\ in\ b$   if $a \to a'$
$a \to b$   if $a \equiv a'$, $a' \to b'$ and $b' \equiv b$

The notations $fn(a)$ and $fv(a)$ denote respectively the sets of free names
and free variables in the expression a. The expression $a[p/x]$ is the notation for
the substitution of the reference p for each free occurrence of the variable x in
the expression a.



Formalization in COQ. We use two inductive definitions to formalize the
above relation in COQ. The first one (table 11) is a restriction to effective
reductions in terms. The second one (table 12) is the complete formalization in
the COQ system of the semantics (proof omitted here). We use this trick to
prevent looping in the proofs. In the sequel of this paper we shall focus on the
first definition.

The COQ formalizations of both relations are the natural translations of the
rules in table 10 into inductive definitions ($\to_{red}$ and $\to_{eval}$). The $\to_{red}$ relation
is defined for Terms and not dbterms so, for every dbterm a appearing in the
COQ definition of $\to_{red}$, (Term a) must hold.



Table 11. Formalization of the $\to_{red}$ relation in COQ

$\to_{red} : dbterm \to dbterm \to Prop$
$Let\_red1: \forall a : Term.\forall p : ref.(let\ x := p\ in\ a) \to_{red} a[p/x].$



For any given term a, $Subst_{vr}(a\ x\ p)$, written $a[p/x]_{vr}$, is computed by
substituting all the occurrences of the variable x by the reference p in a. In the COQ
system, $Subst_{vr}$ must be defined on dbterms and de Bruijn indices are used in the
body of this function. With the help of technical lemmas, we show that $Subst_{vr}$
restricted to Terms can be manipulated without dealing with de Bruijn indices.



Table 12. Formalization of the reduction relation $\to_{eval}$ in COQ

$\to_{eval} : dbterm \to dbterm \to Prop$
$Eval: \forall a, a', b, b' : Term.(a \equiv a') \Rightarrow (b' \equiv b) \Rightarrow (a' \to_{red} b') \Rightarrow (a \to_{eval} b)$



4.2 Induction Scheme for the Semantics 

As before, for the sake of clarity, we only show one (significant) part of the
theorems. Interested readers may refer to [Gil00] for a more detailed presentation.

We can extend the induction scheme for the $\to_{red}$ relation as we did for the
Term predicate. In the induction scheme generated by the COQ system (table
13) we do not have any information about bound names.

Table 13. $\to_{red}$ induction scheme generated by COQ

$Red\_ind :=$
$\forall P : dbterm \to dbterm \to Prop.$
$(\forall p : ref.\forall a, b : Term.(a \to_{red} b) \Rightarrow (P\ a\ b) \Rightarrow (P\ \nu p.a\ \nu p.b)) \Rightarrow$
$\forall a, b : dbterm.(a \to_{red} b) \Rightarrow (P\ a\ b).$



Following the idea of section 3, we can produce an extended induction
scheme which internalizes the α-renaming of bound names in proofs. By using
the general well founded induction theorem for a suitable order on pairs of dbterms
we prove an intermediate theorem (table 14) in which the length of dbterms is
introduced.



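Table 14. $\to_{red}$ induction scheme with the length function

$Red\_length\_ind :=$
$\forall P : dbterm \to dbterm \to Prop.$
$(\forall p : ref.\forall a, b : Term.$
$\quad (\forall a', b' : Term.\ length(a) = length(a') \Rightarrow length(b) = length(b') \Rightarrow (a' \to_{red} b') \Rightarrow (P\ a'\ b')) \Rightarrow$
$\quad (a \to_{red} b) \Rightarrow (P\ \nu p.a\ \nu p.b)) \Rightarrow$
$\forall a, b : dbterm.(a \to_{red} b) \Rightarrow (P\ a\ b).$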



Finally, we prove the induction scheme (table 15) in which bound names can
be chosen outside the set of names in the context. To prove the theorem, we first
need an intermediary lemma stating that $\to_{red}$ is closed under renaming of names.
More precisely, we show that $\to_{red}$ is closed under renaming provided that
names are renamed to fresh names.



Table 15. $\to_{red}$ induction scheme

$Red\_induction :=$
$\forall P : dbterm \to dbterm \to Prop.$
$(\exists Q : set.(Finite\ Q) \land$
$\quad (\forall p : ref.\forall a, b : Term.\ p \notin Q \Rightarrow (a \to_{red} b) \Rightarrow (P\ a\ b) \Rightarrow (P\ \nu p.a\ \nu p.b))) \Rightarrow$
$\forall a, b : dbterm.(a \to_{red} b) \Rightarrow (P\ a\ b).$

5 Well-Formed Terms

The concς-calculus can be typed to distinguish expressions from processes. This
very basic type system has only two types, Exp and Proc, standing for
expressions and processes respectively. Basically, this type system only ensures that
proper processes cannot appear in a context expecting an expression and that
references are correctly handled in a term. A term a is defined as an expression
or a process if a : Exp or a : Proc, respectively.



5.1 Definition 

The typing rules are defined in table 16. T stands for either Exp or Proc. The
domain of a term a, dom(a), is the set of the free references representing the
addresses of an object. Please refer to [GH98] for a general overview.



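Table 16. The well-formed relation

Each rule is written as premises $\Longrightarrow$ conclusion; rules without premises are axioms.

(Well Result)  $u : \text{Exp}$
(Well Clone)   $\text{clone}(u) : \text{Exp}$
(Well Select)  $u.l : \text{Exp}$
(Well Res)     $a : T,\ \ p \in \text{dom}(a) \ \Longrightarrow\ \nu p.a : T$
(Well Concur)  $a : \text{Exp} \ \Longrightarrow\ a : \text{Proc}$
(Well Update)  $b : \text{Exp},\ \ \text{dom}(b) = \emptyset \ \Longrightarrow\ u.l \Leftarrow \varsigma(x)b : \text{Exp}$
(Well Let)     $a : \text{Exp},\ \ b : \text{Exp},\ \ \text{dom}(b) = \emptyset \ \Longrightarrow\ \text{let}\ x = a\ \text{in}\ b : \text{Exp}$
(Well Par)     $a : \text{Proc},\ \ b : T,\ \ \text{dom}(a) \cap \text{dom}(b) = \emptyset \ \Longrightarrow\ a \upharpoonright b : T$
(Well Object)  $b_i : \text{Exp},\ \ \text{dom}(b_i) = \emptyset \ \ \forall i \in 1..n \ \Longrightarrow\ p \mapsto [l_i = \varsigma(x_i)b_i\ ^{i \in 1..n}] : \text{Proc}$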



The COQ formalization of the well-formed relation is the natural translation of
table 16 as an inductive definition well-formed (table 17). well-formed is defined
for Term and not for dbterm. We must ensure that for every term a, (Term a) holds
in the COQ definition.

Table 17. Formalization of well-formed in COQ

$\text{flag} := \text{Exp} \mid \text{Proc}$
$\text{well-formed} : \text{dbterm} \to \text{flag} \to \text{Prop}$

Well_Res: $\forall a : \text{Term}.\ \forall p : \text{ref}.\ \forall T : \text{flag}.\ (\text{well-formed}\ a\ T) \to$
$\quad p \in \text{dom}(a) \to (\text{well-formed}\ \nu p.a\ T).$

5.2 Inversion

In the activity of proofs, inversion theorems are as important as induction
schemes. In the usual cases, inversion theorems automatically generated by the
proof assistants are those expected because the syntax of the calculi is defined
in terms of a least fixed point. In our formalism, binders are functions on top
of a de Bruijn syntax; thus from the equality $\nu p.a = \nu q.b$ it is not possible to
deduce that $p = q$ and $a = b$. If the inversion theorems generated by COQ are
used roughly, they introduce new terms not directly related to anything in the
proof. In table 18 we present the inversion theorem for the Well_Res constructor
of the well-formed property generated by COQ.
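
The loss of injectivity for binders can be seen on a toy model. The following Python sketch is not the paper's COQ development: the term encoding and the helper names `abst` and `nu` are assumptions chosen for illustration (the paper only names its own functions Abst and Inst). It builds $\nu p.a$ as a function over a de Bruijn style syntax and shows two equal binders whose names and bodies differ.

```python
# Terms of a tiny calculus (assumed encoding, not the paper's COQ definitions):
#   ("name", s)      a free name
#   ("idx", n)       a de Bruijn index
#   ("par", a, b)    parallel composition
#   ("nu", body)     restriction; the body refers to the bound name by index


def abst(name, term, depth=0):
    """Replace every free occurrence of `name` in `term` by the index `depth`."""
    tag = term[0]
    if tag == "name":
        return ("idx", depth) if term[1] == name else term
    if tag == "idx":
        return term
    if tag == "par":
        return ("par", abst(name, term[1], depth), abst(name, term[2], depth))
    if tag == "nu":
        # Going under a binder: the abstracted name is one index further away.
        return ("nu", abst(name, term[1], depth + 1))
    raise ValueError(f"unknown term: {term!r}")


def nu(name, body):
    """The binder `nu name. body`, built as a function over the de Bruijn syntax."""
    return ("nu", abst(name, body))


body_p = ("par", ("name", "p"), ("name", "r"))
body_q = ("par", ("name", "q"), ("name", "r"))
left = nu("p", body_p)
right = nu("q", body_q)
print(left == right)      # the two binders are the same de Bruijn term
print(body_p == body_q)   # although their bodies (and bound names) differ
```

This is why an inversion principle derived naively from the constructor equality cannot hand back the original name and body, and why specialized inversion lemmas are needed.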



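Table 18. Inversion lemma for well-formed generated by COQ

$\forall P : \text{dbterm} \to \text{flag} \to \text{Prop}.$

$\forall a : \text{Term}.\ \forall p : \text{ref}.\ \forall T : \text{flag}.$

$(\forall q : \text{names}.\ \forall b : \text{Term}.\ \nu p.a = \nu q.b \to q \in \text{dom}(b) \to (\text{well-formed}\ b\ T) \to (P\ a\ T)) \to$
$(\text{well-formed}\ \nu p.a\ T) \to (P\ a\ T).$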



In a proof in which $(\text{well-formed}\ \nu p.a\ T)$ is amongst the assumptions, using
this theorem will not add $(\text{well-formed}\ a\ T)$ to the hypotheses as expected.
As for induction schemes, the right inversion lemmas must be proved. Fortunately,
it is sufficient to derive a specialized lemma for each constructor of the
inductive definition. The COQ system then provides tactics to use them properly.
Because we can produce one lemma for each constructor, their formulation
remains simple (see table 19 as an example).



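Table 19. Inversion lemma for the Well_Res constructor

Lemma well_res_inv: $\forall P : \text{dbterm} \to \text{flag} \to \text{Prop}.$

$\forall a : \text{Term}.\ \forall p : \text{ref}.\ \forall T : \text{flag}.$

$(p \in \text{dom}(a) \to (\text{well-formed}\ a\ T) \to (P\ a\ T)) \to (\text{well-formed}\ \nu p.a\ T) \to (P\ a\ T).$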



To prove well_res_inv we use a property stating that if $\nu p.a = \nu p.b$ holds then
$a = b$ holds (such properties have to be proved for each of our binders). To
complete the proof, we must show that if $(\text{well-formed}\ a\ T)$ holds for a term a then
$(\text{well-formed}\ a[p/q]\ T)$ holds for any q and any p such that $p \notin Q$ for a finite
set Q. In other words, we need to prove that well-formed is closed under
reference (name in general) renaming. Again, this is not surprising. This property
of the relation well-formed should also be checked when we are reasoning up to
$\alpha$-conversion during informal proofs, though this is most often omitted.

Footnote on the inversion theorem of table 18: the COQ system generates a general
inversion theorem for well-formed; this is a specialized version.
Footnote on the COQ tactics: the Inversion ... using ... tactic.



6 Subject Reduction Theorem 

We show that well-formed terms are closed under the $\to_{red}$ relation. The
formulation of this theorem (see table 20) is exactly the same as its unmechanized
version appearing in [GH98]. The proof is done by induction on $\to_{red}$ using the
extended induction theorem (see table 15). For each induction case, there is
a hypothesis of the form $(\text{well-formed}\ a\ T)$. We use our inversion lemmas to
extract information on sub-terms of a from it.



Table 20. Subject reduction theorem 



Theorem srt:

$\forall a, b : \text{dbterm}.\ \forall T : \text{flag}.\ (\text{well-formed}\ a\ T) \to (a \to_{red} b) \to$
$(\text{well-formed}\ b\ T) \wedge \text{dom}(a) = \text{dom}(b).$



The proof of the theorem is very close to the informal proof with implicit
renamings of bound names. We neither manipulate de Bruijn indices nor
perform $\alpha$-renaming. All the lemmas used during the proof have a semantic
content.

7 Discussion and Related Work 

The size of the different parts of the COQ code is summarized in the table below.
In the Term column we consider all the formalizations and proofs necessary
for using Terms. It includes the properties of $\alpha$-conversion and renaming,
the proofs of the extended induction principles for the Term property, and most
of the lemmas we have proved with the theorem Term_induction. In the
well-formed and $\to_{red}$ columns we classify all the COQ code which deals with the
corresponding property (induction schemes, inversion lemmas, behavior of Subst^,
and dom). The srt column stands for the COQ code of the subject reduction theorem,
and the total column includes all the previous ones plus some general lemmas
(mainly set theory theorems) which do not use de Bruijn indices.





                       Term     well-formed   $\to_{red}$   srt      total
lines of COQ code      7 700    2 400         2 500         1 000    14 600
% of de Bruijn code    65%      10%           25%           -        40%

The percentage of de Bruijn code in proofs is high while this technique is being
set up. In fact, the large portions of de Bruijn code mainly concern the Abst and
Inst functions. But once we have completely mastered the behavior of Terms, we do
not use de Bruijn indices.

Given a property $P : \text{Term} \to \dots \to \text{Term} \to \text{Prop}$, we must show that
there exists a finite set X such that

$m \notin X \to (P\ a_1 \dots a_n) \to (P\ a_1[m/n] \dots a_n[m/n])$

to get an induction principle which internalizes name renaming and the expected
inversion lemmas. Checking that P is closed under renaming of names can be
laborious in COQ, whereas this is assumed in on-paper proofs. Moreover, as we have
experienced during this development, it clears the way for further proofs on the
P property.

Related work. Among all the works formalizing variable-binding operators
in calculi, none, as far as we know, uses the technique we have used here.
Daniel Hirschkoff has encoded a polyadic $\pi$-calculus with de Bruijn numbers and
proved many bisimulation results [Hir97]. Bruno Barras [Bar95] formalizes COQ
in COQ with de Bruijn indices. In both approaches de Bruijn indices appear in
almost all theorems and specifications. We think this is not natural. L. Henry-Greard
[Hen98] uses the technique of R. Pollack and J. McKinna [MP93] to formalize
the $\pi$-calculus and prove a subject reduction theorem for it. In this technique,
closer to the on-paper formalism, there are two kinds of names, one for free names
and another for bound ones. We think this is not completely natural. J. Despeyroux
has investigated a higher-order approach in which the lambda abstraction
of the logic is used for binding free variables of the calculus [Des]. See [DH94]
for a general account of this technique in COQ. F. Honsell, M. Miculan and
I. Scagnetto [FI98] have encoded the $\pi$-calculus in COQ following a higher-order
approach. They use coinductive types in their encoding of bisimulation.
Although second order techniques are very efficient, we think that proofs using
these techniques are very different from proofs on paper.

8 Conclusion and Future Work 

We have formalized a concurrent object calculus in the COQ system with names
in binders using a technique proposed by A. Gordon [Gor94]. We have shown
that defining properties on Terms, namely those which formalize the conc$\varsigma$-calculus
in the COQ system, is very natural and easy because we just need to rewrite
them using the COQ syntax. Under the assumption that a given property P
is invariant under the renaming of names, the induction theorem generated by
COQ for P can be strengthened to internalize $\alpha$-renaming of bound variables.
Although our syntax is not generated by a least fixed point, we have inversion
lemmas for P, but they must be proved. The proofs of these theorems, like the
proof of the subject reduction theorem, are free of de Bruijn indices. Moreover, the
proofs dealing with real properties of the calculus follow the general outline of
their on-paper counterparts.

The main drawback of this approach is that each time we have to define
functions on Terms we have to define them on dbterms first, then prove that
they behave as expected on Terms. We believe that with a good understanding
of the behavior of a function on Terms, it is not hard to give its definition on
dbterms. We claim that this weakness does not outweigh the advantages of the
method. In fact, new functions on our syntax will probably use functions we
have already defined, allowing re-use of our COQ proofs (as is done for the
function Subst^, which appears in $\to_{red}$).

For a property P, the strengthened induction theorem can be a large term.
It would be interesting to develop tools for generating it automatically, because
this extended induction scheme is mechanically derivable (not provable) from P.
Another reasonable development would be to include tactics for automating, on
Terms, the computation steps of functions. We have done some preliminary work
in this direction.


Abstract. Fluted logic is a fragment of first-order logic without function
symbols in which the arguments of atomic subformulae form ordered 
sequences. A consequence of this restriction is that, whereas first-order 
logic is only semi-decidable, fluted logic is decidable. In this paper we 
present a sound, complete and terminating inference procedure for fluted 
logic. Our characterisation of fluted logic is in terms of a new class of so- 
called fluted clauses. We show that this class is decidable by an ordering 
refinement of first-order resolution and a new form of dynamic renaming, 
called separation. 



1 Introduction 

Fluted logic is of interest for a number of reasons. One of our main motivations 
for studying fluted logic is the continuation of the programme of characterising 
first-order decidability by resolution methods. There are various ways of defining 
decidable fragments of first-order logic. Fragments considered until the sixties
usually involve some form of restriction on quantification. In prefix classes such
as the Bernays-Schönfinkel class, the initially extended Ackermann class, and the
initially extended Gödel class, the quantifier prefixes are restricted to
$\exists^*\forall^*$, $\exists^*\forall\exists^*$ and $\exists^*\forall\forall\exists^*$, respectively. In the guarded and
loosely guarded fragments, which were introduced more recently, quantifiers are
restricted to conditional quantifiers of the form $\exists y\, G(x,y) \wedge \varphi$ or
$\forall y\, G(x,y) \to \varphi$, where $G(x,y)$ is a guard formula satisfying certain
restrictions ($G(x,y)$ is an atom in the case of the guarded fragment).
In Maslov's class K (more precisely, in the dual class $\overline{K}$) there is a restriction
on universal quantification. Other decidable classes such as the monadic class
and FO$^2$ are defined over predicate symbols with bounded arity. By contrast,
the restriction of first-order logic which ensures decidability for fluted logic is an
ordering on variables and arguments. With the exception of fluted logic, the
mentioned logics have been studied in the context of resolution and superposition,
see for example Joyner [18], Fermüller, Leitsch, et al. [5,6], Bachmair, Ganzinger
and Waldmann [3], de Nivelle [4], Ganzinger and de Nivelle [7], Hustadt and
Schmidt [15].

Another reason for our interest in fluted logic is the relationship to non-
classical logics. Extended modal logics and expressive description logics play an
increasingly important role in various areas of computer science. Fluted logic 
may be viewed as a generalisation of modal logic, just as the guarded fragments 
can. The properties fluted logic is known to share with modal logics are de- 
cidability and the finite model property [21,22,24]. From a modal perspective 
an advantage of fluted logic over the guarded fragment is that relational atoms 
may be negated. This means that logics such as Boolean modal logic [8] and 
other enriched modal logics [9,12,13], as well as expressive description logics 
like ALB (without converse) [16], which cannot be embedded in the guarded
fragment, can be embedded in fluted logic. Interestingly, translations of propo- 
sitional modal formulae by both the relational translation and a variation of the 
functional translation (described and used in [11,14]) are fluted formulae. This 
raises the question whether the results of Ohlbach and Schmidt [19,26] can be 
generalised to fluted logic. The answer to this question is negative, though. Al- 
ready the use of the quantifier exchange operator, which swaps existential and 
universal quantifiers in a non-standard fashion [19], leads to loss of soundness. A 
counter example is the relational translation of the second formula in the class 
of branching if-formulae, defined in [10, Prop. 6.5]. 

Historically, fluted logic arose as a byproduct of the predicate functor logic 
introduced by Quine [25]. Adding various combinatory operators to fluted logic
defines a lattice of fluted logics, in which fluted logic is the weakest logic and 
first-order logic with equality is the most expressive logic. (The combinatory 
operators are equality, binary converse, permutation of arguments, addition of 
vacuous arguments, fusions of arguments, and composition of binary atoms.) In 
a series of papers [21,22,24] Purdy studies the decision problem of fluted logics 
in this lattice, and establishes the limit of decidability to be the boundary of 
the ideal generated by the fluted logic with binary converse and equality [24]. This
logic is the most expressive decidable logic in the lattice of fluted logics. In [23] 
Purdy describes an application in computational linguistics of fluted logics for 
modelling ordinary English. 

In this paper we characterise fluted logic by a new class of clauses, called 
the class of fluted clauses. We present a decision procedure for this class which 
is based on an ordering refinement of resolution and an additional separation 
rule. This is a new inference rule which does dynamic renaming. It replaces 
a clause $C \vee D$ by two clauses $\neg A \vee C$ and $A \vee D$, where A is an atom
with a newly introduced predicate symbol. The rule is sound, in general, and
resolution extended by this rule remains complete, if for any set N of clauses
the number of applications of separation in any derivation from N is finitely
bounded. Separation is essential for our decision procedure, since it allows us to
transform certain problematic fluted clauses into so-called strongly fluted clauses.
A strongly fluted clause is a fluted clause that contains a literal which includes
all the variables of the clause. When inference is restricted to such literals (i) the
number of variables in any derivable clause is finitely bounded, in particular, the
number of variables does not exceed the number of variables in the original clause
set. To show termination, it is usually sufficient [18] to show in addition that
(ii) there is a bound on the depth of terms occurring in derived clauses. Because 
separation introduces new predicate names during the derivation, in our case we 
also need to show that (iii) there is a bound on the number of applications of 
the separation rule. Exhibiting (ii) and (iii), along with verifying the deductive 
closure of the class of (strongly) fluted clauses are the most difficult parts of the 
termination proof. The difficulty can be attributed to the fact that the depth 
of terms can grow during the derivation, as is the case for some other solvable 
clausal classes, for example, those associated with Maslov’s class K [5,14]. 

The paper is organised as follows. Fluted logic is defined in Section 2. Sec- 
tion 3 gives a brief description of the general ordered resolution calculus. The 
class of fluted clauses is defined in Section 4. In Section 5 we specify how fluted
formulae can be translated into sets of fluted clauses. The new separation rule
is defined in Section 6. In Section 7 we define an ordering refinement and prove
termination. We conclude with some remarks about the complexity of the class. 
Because of the space limitations all proofs had to be omitted, but can be found 
in [17]. 

Throughout, our notational convention is the following: x, y, z are the letters
reserved for first-order variables, s, t, u, v for terms, a, b for constants, f, g, h for
function symbols, and p, q, r, P, Q, R for predicate symbols. The Greek letters
$\varphi$, $\psi$, $\phi$ are reserved for formulae. A is the letter reserved for atoms, L for
literals, and C, D for clauses. For sets of clauses we use the letter N.



2 Fluted Logic 

Let $\mathcal{P}$ be a finite set of predicate symbols and let $X_m = \{x_1, \dots, x_m\}$ be an
ordered set of variables. An atomic fluted formula of $\mathcal{P}$ over $X_i$ is an n-ary
atom $P(x_l, \dots, x_i)$, with $l = i - n + 1$ and $n \le i$. Fluted formulae are defined
inductively as follows:

1. Any atomic fluted formula over $X_i$ is a fluted formula over $X_i$.

2. $\exists x_{i+1}\varphi$ and $\forall x_{i+1}\varphi$ are fluted formulae over $X_i$, if $\varphi$ is a fluted
formula over $X_{i+1}$.

3. Any Boolean combination of fluted formulae over $X_i$ is a fluted formula over
$X_i$. That is, $\varphi \to \psi$, $\varphi \wedge \psi$, etcetera, are fluted formulae over $X_i$, if both
$\varphi$ and $\psi$ are.

By definition, for any formula $\varphi$, if there is a variable renaming h such that $h(\varphi)$
is a fluted formula according to the above definition then $\varphi$ is a fluted formula.
In this paper the assumption is that all fluted formulae are closed.

The semantics of fluted logic is defined like the semantics of first-order logic.
Three examples of fluted formulae from a linguistic or knowledge representation
setting are the following (mwmc is short for 'married couple all of whose
children are married').

$\forall x_1(\text{cheese-eater}(x_1) \to \exists x_2(\text{cheese}(x_2) \wedge \text{eats}(x_1, x_2)))$

$\forall x_1(\text{cheese-lover}(x_1) \to \forall x_2(\text{cheese}(x_2) \to (\text{eats}(x_1, x_2) \wedge \text{likes}(x_1, x_2))))$

$\forall x_1 x_2(\text{mwmc}(x_1, x_2) \to (\text{married}(x_1, x_2) \wedge \forall x_3(\text{have-child}(x_1, x_2, x_3) \to \exists x_4\, \text{married}(x_3, x_4))))$

The first formula can be expressed by a multi-modal formula, while the second
can only be expressed in a modal logic with an enriched language like Boolean
modal logic or description logics with role negation. Because guards may only
have a certain polarity the second formula does not belong to the guarded or
loosely guarded fragments. The third formula is also not guarded and does not
belong to Maslov's class K either. On the other hand, the formulae

$\forall x_1 x_2(\text{married}(x_1, x_2) \wedge \forall x_3(\text{is-child}(x_3, x_1, x_2) \to \text{doctor}(x_3)))$

$\forall x_1 x_2(\text{married}(x_1, x_2) \wedge \exists x_3(\text{have-child}(x_1, x_2, x_3) \to \exists x_4\, \text{married}(x_4, x_3)))$

$\forall x_1 x_2 x_3(\text{ancestor}(x_1, x_2) \wedge \text{ancestor}(x_2, x_3)) \to \text{ancestor}(x_1, x_3)$

are not fluted formulae, because in all instances the ordering of the arguments
is violated in some atom.
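
The inductive definition can be turned into a small syntactic check. The sketch below is only an illustration: the tuple encoding of formulae and the function name `is_fluted` are assumptions, variables are written as their indices (1 for $x_1$, and so on), and the check does not search for a variable renaming h, so it expects the variables to be named in quantification order already.

```python
# Formula encoding (an assumption of this sketch):
#   ("atom", name, (i1, ..., in))      atom with arguments x_i1, ..., x_in
#   ("not", f), ("and", f, g), ("or", f, g), ("imp", f, g)
#   ("forall", k, f), ("exists", k, f) quantifier binding the variable x_k


def is_fluted(formula, i=0):
    """Check that `formula` is a fluted formula over X_i = {x_1, ..., x_i}."""
    tag = formula[0]
    if tag == "atom":
        args = tuple(formula[2])
        n = len(args)
        # An n-ary atom over X_i must have arguments x_l, ..., x_i with l = i - n + 1.
        return n <= i and args == tuple(range(i - n + 1, i + 1))
    if tag in ("forall", "exists"):
        _, k, body = formula
        # The quantifier must bind x_{i+1}, and the body is fluted over X_{i+1}.
        return k == i + 1 and is_fluted(body, i + 1)
    if tag in ("not", "and", "or", "imp"):
        return all(is_fluted(sub, i) for sub in formula[1:])
    raise ValueError(f"unknown formula: {formula!r}")


cheese_eater = ("forall", 1, ("imp",
    ("atom", "cheese-eater", (1,)),
    ("exists", 2, ("and", ("atom", "cheese", (2,)), ("atom", "eats", (1, 2))))))

transitivity = ("forall", 1, ("forall", 2, ("forall", 3, ("imp",
    ("and", ("atom", "ancestor", (1, 2)), ("atom", "ancestor", (2, 3))),
    ("atom", "ancestor", (1, 3))))))

print(is_fluted(cheese_eater))   # the first example formula is accepted
print(is_fluted(transitivity))   # ancestor(x1, x2) is not a suffix of (x1, x2, x3)
```

Closed formulae are checked over $X_0$, which is why the default for `i` is 0.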



3 Resolution 

The usual definition of clausal logic is assumed. A literal is an atom or the 
negation of an atom. The former is said to be a positive literal and the latter a 
negative literal. In this paper clauses are assumed to be multisets of literals, and 
will be denoted by $P(x) \vee P(x) \vee \neg R(x,y)$, for example. The components in
the variable partition of a clause are called variable-disjoint or split components,
that is, split components do not share variables. A clause which cannot be split
further will be called a maximally split clause. The condensation cond(C) of a
clause C is a minimal subclause of C which is a factor of C. We take equality
of clauses (or formulae) to be equality modulo variable renaming. Two clauses 
(or formulae) that are equal modulo variable renaming are said to be variants 
of each other. 
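
As an illustration of the variable partition, the following Python sketch groups the literals of a clause into split components. The representation of a literal as a pair of a display string and its set of variables, and the function name `split_components`, are assumptions made for this example only.

```python
def split_components(clause):
    """Partition a clause into variable-disjoint (split) components.

    `clause` is a list of (literal_text, set_of_variables) pairs.
    Returns a list of components, each a list of literal texts.
    """
    components = []  # each entry: [list of literals, set of their variables]
    for literal, variables in clause:
        variables = set(variables)
        merged_literals, merged_vars = [literal], set(variables)
        remaining = []
        for lits, vs in components:
            if vs & variables:
                # This component shares a variable with the new literal: merge it.
                merged_literals = lits + merged_literals
                merged_vars |= vs
            else:
                remaining.append([lits, vs])
        remaining.append([merged_literals, merged_vars])
        components = remaining
    return [lits for lits, _ in components]


clause = [("P(x)", {"x"}), ("Q(y)", {"y"}), ("R(x,z)", {"x", "z"}), ("S(y)", {"y"})]
print(split_components(clause))
```

A clause is maximally split exactly when this function returns a single component (or none, for the empty clause). In this sketch every ground literal forms a component of its own.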

We say an expression is functional if it contains a constant or a non-nullary 
function symbol. Otherwise it is called non-functional. An expression is shallow 
if it does not contain a non-constant functional term. The set of variables of an 
expression E will be denoted by var(E).

Next, we briefly recall the definition of ordered resolution from Bachmair and 
Ganzinger [1,2]. Derivations are controlled through an admissible ordering $\succ$. In
the full calculus a second parameter, a selection function, may be used, but for 
the results of this paper it is not essential. 

Deduce: $\dfrac{N}{N \cup \{\text{cond}(C)\}}$ if C is a factor or resolvent of premises in N.

Delete: $\dfrac{N \cup \{C\}}{N}$ if C is redundant.

Split: $\dfrac{N \cup \{C \vee D\}}{N \cup \{C\} \mid N \cup \{D\}}$ if C and D are variable-disjoint.

Resolvents and factors are computed with:

Ordered resolution: $\dfrac{C \vee A_1 \qquad \neg A_2 \vee D}{(C \vee D)\sigma}$

provided (i) $\sigma$ is the most general unifier of $A_1$ and $A_2$, (ii) $A_1\sigma$ is strictly
maximal with respect to $C\sigma$, and (iii) $\neg A_2\sigma$ is maximal with respect to $D\sigma$.

Ordered factoring: $\dfrac{C \vee A_1 \vee A_2}{(C \vee A_1)\sigma}$

provided (i) $\sigma$ is the most general unifier of $A_1$ and $A_2$, and (ii) $A_1\sigma$ is
maximal with respect to $C\sigma$.

Fig. 1. The calculus R

By definition, an ordering $\succ$ is admissible, if (i) it is a total well-founded
ordering on the set of ground literals, (ii) for any atoms A and B, it satisfies:
$\neg A \succ A$, and $B \succ A$ implies $B \succ \neg A$, and (iii) it is stable under the application
of substitutions. (An ordering is said to be liftable if it satisfies (iii).) The multiset
extension of $\succ$ provides an admissible ordering on clauses. A literal L is said to
be (strictly) maximal with respect to a clause C if for any literal $L'$ in C, $L' \not\succ L$
($L' \not\succeq L$). A literal in a clause C is said to be eligible if it is maximal with respect
to C. An ordering $\succ$ is compatible with a given complexity measure $c_L$ on ground
literals, if $c_L \succ c_{L'}$ implies $L \succ L'$ for any two ground literals L and $L'$.
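
For ground literals, conditions (i) and (ii) are easy to realise. The sketch below is one possible instance, not the ordering used later in the paper: it assumes a given total ranking `atom_rank` of the ground atoms and compares literals lexicographically by (rank of the atom, polarity). All function names are hypothetical.

```python
def literal_key(literal, atom_rank):
    """Sort key of a ground literal given as (atom, is_negative)."""
    atom, is_negative = literal
    # Comparing (rank, polarity) lexicographically gives: not A > A,
    # and B > A implies B > not A.
    return (atom_rank[atom], is_negative)


def is_maximal(literal, rest, atom_rank):
    """No literal in `rest` is strictly greater than `literal`."""
    key = literal_key(literal, atom_rank)
    return all(not literal_key(other, atom_rank) > key for other in rest)


def is_strictly_maximal(literal, rest, atom_rank):
    """No literal in `rest` is greater than or equal to `literal`."""
    key = literal_key(literal, atom_rank)
    return all(not literal_key(other, atom_rank) >= key for other in rest)


atom_rank = {"p": 0, "q": 1}
q = ("q", False)
rest = [("p", True), ("q", False)]            # the clause is a multiset: q occurs twice
print(is_maximal(q, rest, atom_rank))           # q is maximal with respect to rest
print(is_strictly_maximal(q, rest, atom_rank))  # but not strictly, because of the copy
```

In the terms of Figure 1, the first test corresponds to the side condition of ordered factoring and the second to condition (ii) of ordered resolution, here on ground clauses only.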

Let R be the resolution calculus defined by the rules of Figure 1. The com- 
pleteness proof sanctions a global notion of redundancy, with which additional 
don’t-care non-deterministic simplification and deletion rules can be supported. 
Essentially, a ground clause is redundant in a set N with respect to the ordering
$\succ$ if it follows from smaller instances of clauses in N, and a non-ground clause
is redundant in N if all its ground instances are redundant in N . For example, 
any tautologous clause is redundant. 

A (theorem proving) derivation from a set N of clauses is a finitely branching
tree with root N constructed by applications of the expansion rules. A derivation
T is a refutation if for every path $N (= N_0), N_1, \dots$, the clause set $\bigcup_j N_j$ contains
the empty clause. A derivation T from N is called fair if for any path
$N (= N_0), N_1, \dots$ in the tree T, with limit $N_\infty = \bigcup_j \bigcap_{k \ge j} N_k$, it is the case that each
clause C that can be deduced from non-redundant premises in $N_\infty$ is contained
in some set $N_j$.



Theorem 1 ([3]). Let N be a set of clauses and let T be a fair R-derivation
from N (up to redundancy). Then, N is unsatisfiable iff for every path
$N (= N_0), N_1, \dots$, the clause set $\bigcup_j N_j$ contains the empty clause.

It should be noted that inferences with ineligible literals are not forbidden,
but are provably redundant. In other words, only inferences with eligible literals 
need to be performed for soundness and completeness. 

Strictly, the “Split” rule is inessential for the results of this paper, though 
it may have some computational advantages (we comment on this in the final 
section). However, the inclusion of splitting allows for a more concise presentation
of fluted clauses. 

4 Fluted Clauses 

This section introduces the class of fluted clauses into which fluted formulae 
can be translated. Without loss of generality we consider only maximally split 
clauses. 

In fluted clauses the arguments of literals have a characteristic form which
will be described with the help of a sequence notation. $\overline{u}_i$ will denote a
finite, possibly empty, sequence $(u_i, u_{i+1}, \dots, u_m)$ of terms. In this paper, unless
specified otherwise, each non-empty sequence $\overline{u}_i$ is assumed to end with $u_m$.
Thus, the sequences $\overline{u}_1, \overline{u}_2, \dots, \overline{u}_m$ are linearly ordered by (the converse
of) the 'is a proper suffix of' relationship. Note that $\overline{u}_m = u_m$. Given that
$\overline{u}_i = (u_i, \dots, u_m)$,

$(\overline{u}_i, t)$ will denote the sequence $(u_i, \dots, u_m, t)$,
$f(\overline{u}_i)$ will denote the term $f(u_i, \dots, u_m)$,
$P(\overline{u}_i)$ will denote the atom $P(u_i, \dots, u_m)$,
$C(\overline{u}_i)$ will denote a (possibly empty) clause of literals
of the form $(\neg)P(\overline{u}_i)$.

If $\overline{u}_i$ is the empty sequence then $f(\overline{u}_i)$, $P(\overline{u}_i)$ and $C(\overline{u}_i)$ respectively denote
a constant, a propositional literal and a (possibly empty) propositional clause.
$\overline{u}_i$ is said to be the argument sequence of $f(\overline{u}_i)$, $P(\overline{u}_i)$ and $C(\overline{u}_i)$. A sequence
with n elements will be called an n-sequence.

Assume m is a non-negative integer, and $X_m = \{x_1, \dots, x_m\}$ is a set of m
ordered variables. We refer to a sequence of terms $\overline{u} = (u_1, \dots, u_n)$ as a fluted
sequence over $X_m$ if the following conditions are all satisfied: (i) $n > m$, (ii) $u_1 =
x_1, \dots, u_m = x_m$, (iii) the number of variables occurring in $(u_{m+1}, \dots, u_n)$ is
m, and (iv) for every k with $m < k \le n$, there is an i with $1 \le i < k$ such that
$u_k = f(u_i, \dots, u_{k-1})$ for some function symbol f. The sequence $(x_1, \dots, x_m)$
will be called the variable prefix of $\overline{u}$. Examples of fluted sequences are:

$(a)$, a fluted sequence over $X_0 = \emptyset$,

$(x_1, x_2, x_3, f(x_1, x_2, x_3))$,

$(x_1, x_2, x_3, f(x_2, x_3), g(x_1, x_2, x_3, f(x_2, x_3)))$,

$(x_1, x_2, f(x_1, x_2), g(f(x_1, x_2)), h(x_2, f(x_1, x_2), g(f(x_1, x_2))))$.

However, $(x_1, x_2, x_3, f(x_2, x_3))$ is not a fluted sequence, as condition (iii) is
violated.
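
Conditions (i) to (iv) can be checked mechanically. The following Python sketch uses an assumed term encoding (a variable $x_k$ is the integer k, a functional term is a tuple whose first component is the function symbol) and a hypothetical function name `is_fluted_sequence`. One further assumption is labelled in the code: a constant is treated as a function symbol applied to the empty sequence, so that the example $(a)$ is accepted.

```python
def variables_of(term):
    """Set of variable indices occurring in a term."""
    if isinstance(term, int):
        return {term}
    result = set()
    for arg in term[1:]:
        result |= variables_of(arg)
    return result


def is_fluted_sequence(seq, m):
    """Check conditions (i)-(iv) for `seq` over X_m (variables 1..m)."""
    seq = tuple(seq)
    n = len(seq)
    if not n > m:                                    # (i)
        return False
    if list(seq[:m]) != list(range(1, m + 1)):       # (ii) variable prefix
        return False
    tail_vars = set()
    for term in seq[m:]:
        tail_vars |= variables_of(term)
    if len(tail_vars) != m:                          # (iii)
        return False
    for pos in range(m, n):                          # (iv), pos is 0-based
        term = seq[pos]
        if isinstance(term, int):
            return False
        args = tuple(term[1:])
        # Assumption: args may be empty, i.e. constants are allowed.
        if len(args) > pos:
            return False
        # The arguments must be the elements immediately preceding the term.
        if args != seq[pos - len(args):pos]:
            return False
    return True


f23 = ("f", 2, 3)
f12 = ("f", 1, 2)
print(is_fluted_sequence([("a",)], 0))
print(is_fluted_sequence([1, 2, 3, ("f", 1, 2, 3)], 3))
print(is_fluted_sequence([1, 2, 3, f23, ("g", 1, 2, 3, f23)], 3))
print(is_fluted_sequence([1, 2, f12, ("g", f12), ("h", 2, f12, ("g", f12))], 2))
print(is_fluted_sequence([1, 2, 3, f23], 3))   # fails condition (iii)
```

The first four calls correspond to the four example sequences and the last one to the counterexample.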

By definition, a clause C is a fluted clause over $X_m$ if one of the following
holds.

(FL0) C is a (possibly empty) propositional clause.

(FL1) C is not empty, $\text{var}(C) = X_m$, and for any literal L in C, there is
some i where $1 \le i \le m$ such that the argument sequence of L is
$(x_i, x_{i+1}, \dots, x_m)$.

(FL2) C is functional and not empty, $\text{var}(C) = X_m$, and for any literal L in C the
argument sequence of L is either $(x_i, x_{i+1}, \dots, x_m)$ or $(u_j, u_{j+1}, \dots, u_n)$,
where $1 \le i \le m$ and $(u_j, u_{j+1}, \dots, u_n)$ is a suffix of some fluted sequence
$\overline{u} = (u_1, \dots, u_n)$ over $\{x_k, \dots, x_m\}$, for some k with $1 \le k \le m$. $\overline{u}$
will be referred to as the fluted sequence associated with L. (By 4. of
Lemma 1 below there can be just one fluted sequence associated with a
given literal.)

(FL3) C is not empty, $\text{var}(C) = X_{m+1}$, and for any literal L in C, the argument
sequence of L is either $(x_1, x_2, \dots, x_m)$ or $(x_i, \dots, x_m, x_{m+1})$, where
$1 \le i \le m$.

A fluted clause will be called a strongly fluted clause if it is either ground or has
a literal which contains all the variables of the clause.

It may be helpful to consider some examples. The clause

$P(x_1, x_2, x_3, x_4, x_5) \vee Q(x_1, x_2, x_3, x_4, x_5) \vee \neg R(x_4, x_5) \vee S(x_5)$

satisfies the scheme (FL1), and is defined over five variables. Examples of fluted
clauses of type (FL3) which are defined over two (!) variables are:

$Q(x_1, x_2) \vee \neg P(x_1, x_2, x_3) \vee \neg R(x_2, x_3) \vee S(x_3)$
$Q(x_1, x_2) \vee \neg R(x_2, x_3) \vee S(x_3)$

The following are fluted clauses of type (FL2), where $\overline{x}_1 = (x_1, \dots, x_4)$.

$R(x_2, f(x_1, x_2)) \vee S(f(x_1, x_2))$

$Q(x_1, x_2) \vee R(x_2) \vee P(x_1, x_2, f(x_1, x_2)) \vee R(x_2, f(x_1, x_2)) \vee S(f(x_1, x_2))$

$Q(x_3, x_4) \vee R(x_4) \vee P(x_4, g(\overline{x}_1), f(x_4, g(\overline{x}_1))) \vee R(f(x_4, g(\overline{x}_1)))$

$Q(\overline{x}_1) \vee P(f(\overline{x}_1, h(\overline{x}_1)), g(\overline{x}_1, h(\overline{x}_1), f(\overline{x}_1, h(\overline{x}_1))))$
$\quad \vee\ R(x_4, h(\overline{x}_1), g'(x_4, h(\overline{x}_1)))$

A few remarks are in order. First, the non-functional subclause of a (FL2)-clause
will be denoted by $\nabla$. Note that $\nabla$ satisfies (FL1), in other words, clauses
of the form (FL1) are building blocks of (FL2)-clauses. Second, clauses of the
form (FL3) are defined to be fluted clauses over m variables, even though they
contain $m + 1$ variables. This may seem a bit strange, but this definition ensures
a direct association of fluted formulae over m variables to fluted clauses over m
variables. Third, no fluted clause can simultaneously satisfy any two of (FL0),
(FL1), (FL2) and (FL3). Fourth, using the previously introduced notation a
schematic description of the non-propositional shallow clauses (FL1) and (FL3)
is:

$C(\overline{x}_1) \vee C(\overline{x}_2) \vee \dots \vee C(\overline{x}_m) \quad (= \nabla)$

$C(\overline{x}_1) \vee C(\overline{x}_1, x_{m+1}) \vee C(\overline{x}_2, x_{m+1}) \vee \dots \vee C(\overline{x}_m, x_{m+1}) \vee C(x_{m+1})$

Fifth, strongly fluted clauses have special significance in connection with
termination of resolution, particularly with respect to the existence of a bound on the
number of variables in any clause. Under the refinement we will use, the eligible
literals are literals which contain all the variables of the clause. So, the number
of variables in resolvents of strongly fluted premises will always be less than or
equal to the number of variables in any of the parent clauses.

The next results give some properties of fluted sequences and strongly fluted 
clauses. 

Lemma 1. Let $\overline{u}$ be a fluted sequence over $X_m$. Then:

1. There is an element $u_k$ of $\overline{u}$ such that $u_k = f(u_1, \dots, u_{k-1})$, for some f.

2. If $u_n$ is the last element of $\overline{u}$ then $\text{var}(u_n) = X_m$.

3. $\overline{u}$ is uniquely determined by its last element.

4. If $(u_j, u_{j+1}, \dots, u_n)$ is a suffix of $\overline{u}$ then $(u_1, \dots, u_{j-1})$ is uniquely
determined by $(u_j, u_{j+1}, \dots, u_n)$.

By the definition of (FL2)-clauses:

Lemma 2. Let L be any literal of a (FL2)-clause defined over $X_m$. Then, all
occurrences of variable sequences in L are suffixes of $(x_1, \dots, x_m)$.

Lemma 3. Let C be a fluted clause over m variables. C is strongly fluted iff
1. C satisfies exactly one of the conditions (FL0), (FL1), (FL2), or 2. C satisfies
condition (FL3), and it contains a literal with $m + 1$ variables.

In other words, with the exception of certain (FL3)-clauses all fluted clauses 
include at least one literal which contains all the variables of the clause. 
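
The defining condition of strongly fluted clauses is simple to test once a clause is known to be fluted. A minimal Python sketch, assuming literals are given as pairs of a display string and the set of their variables (the function name `is_strongly_fluted` is hypothetical):

```python
def is_strongly_fluted(clause):
    """Test the 'strongly fluted' condition on a clause assumed to be fluted.

    `clause` is a list of (literal_text, set_of_variables) pairs.
    """
    all_vars = set()
    for _, variables in clause:
        all_vars |= set(variables)
    if not all_vars:
        return True  # ground clauses are strongly fluted by definition
    # Otherwise some literal must contain every variable of the clause.
    return any(set(variables) == all_vars for _, variables in clause)


# The two (FL3) example clauses from Section 4, over x1, x2, x3.
with_p = [("Q(x1,x2)", {1, 2}), ("-P(x1,x2,x3)", {1, 2, 3}),
          ("-R(x2,x3)", {2, 3}), ("S(x3)", {3})]
without_p = [("Q(x1,x2)", {1, 2}), ("-R(x2,x3)", {2, 3}), ("S(x3)", {3})]
print(is_strongly_fluted(with_p))      # P contains all three variables
print(is_strongly_fluted(without_p))   # no literal contains x1, x2 and x3
```

The second clause is an instance of the exceptional (FL3) case of Lemma 3.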

5 From Fluted Formulae to Fluted Clauses

Our transformation of fluted formulae into clausal form employs a standard
renaming technique, known as structural transformation or renaming, see for
example [20]. For any first-order formula $\varphi$, the definitional form obtained by
introducing new names for subformulae at positions in $\Lambda$ will be denoted by
$\text{Def}_\Lambda(\varphi)$.

Theorem 2. Let $\varphi$ be a first-order formula. For any subset $\Lambda$ of the set of
positions of $\varphi$, 1. $\varphi$ is satisfiable iff $\text{Def}_\Lambda(\varphi)$ is satisfiable, and 2. $\text{Def}_\Lambda(\varphi)$ can
be computed in polynomial time.
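
To give an idea of what renaming does, here is a deliberately simplified Python sketch for propositional formulae. It is not the definition of $\text{Def}_\Lambda$ from [20]: it names every non-literal subformula, it records each definition as a pair (new atom, renamed subformula) without fixing whether the definition is an implication or an equivalence, and it ignores quantifiers and the variable arguments that new predicates need in the first-order case. The names `rename_subformulae` and `Q1, Q2, ...` are assumptions, and the input is assumed not to use atoms with these names already.

```python
import itertools


def rename_subformulae(formula):
    """Name every non-literal subformula of a propositional formula.

    Formulae are tuples: ("atom", name), ("not", f), ("and", f, g), ("or", f, g).
    Returns (top, definitions), where `top` stands for the whole formula and
    `definitions` lists (new_atom, subformula over atoms and new atoms).
    """
    counter = itertools.count(1)
    definitions = []

    def is_literal(f):
        return f[0] == "atom" or (f[0] == "not" and f[1][0] == "atom")

    def visit(f):
        if is_literal(f):
            return f
        # Rename the immediate subformulae first, then name this position.
        renamed_children = tuple(visit(sub) for sub in f[1:])
        new_atom = ("atom", f"Q{next(counter)}")
        definitions.append((new_atom, (f[0],) + renamed_children))
        return new_atom

    return visit(formula), definitions


formula = ("and", ("atom", "p"), ("or", ("atom", "q"), ("not", ("atom", "r"))))
top, definitions = rename_subformulae(formula)
print(top)
for new_atom, body in definitions:
    print(new_atom, "names", body)
```

Each subformula position gives rise to one definition of bounded size, so the result grows only linearly with the input, which is the intuition behind the polynomial bound in Theorem 2.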
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In this paper we assume the clausal form of a first-order formula (p, writ- 
ten Cls((^), is computed by transformation into conjunctive normal form, outer 
Skolemisation, and clausifying the Skolemised formula. 

By introducing new literals for each non-literal subformula position, any given 
fluted formula can be transformed into a set of strongly fluted clauses. 

Lemma 4. Let $\varphi$ be any fluted formula. If $\Lambda$ contains all non-literal subformula
positions of $\varphi$ then $\mathrm{ClsDef}_\Lambda(\varphi)$ is a set of strongly fluted clauses (provided the
newly introduced literals have the form $(\neg)Q_\lambda(\overline{x_i})$).

Transforming any given fluted formula into a set of fluted clauses requires
the introduction of new symbols for all quantified subformulae.¹

Lemma 5. Let $\varphi$ be any fluted formula over $m$ ordered variables. If $\Lambda$ contains
at least the positions of any subformulae $\exists x_{i+1}\psi$, then $\mathrm{ClsDef}_\Lambda(\varphi)$ is a
set of fluted clauses (again, provided the new literals have the form $(\neg)Q_\lambda(\overline{x_i})$).

6 Separation 

The motivation for introducing separation is that the class of fluted clauses
is not closed under resolution. In particular, resolvents of non-strongly fluted
(FL3)-clauses are not always fluted and can cause (potentially) unbounded
variable chaining across literals. This is illustrated by considering resolution between
$P_1(x_1,x_2) \lor Q_1(x_2,x_3) \lor R(x_2,x_3)$ and $\neg R(x_1,x_2) \lor P_2(x_1,x_2) \lor Q_2(x_2,x_3)$,
which produces the resolvent $P_1(x_1,x_2) \lor Q_1(x_2,x_3) \lor P_2(x_2,x_3) \lor Q_2(x_3,x_4)$.
We note that it contains four variables, whereas the premises each contain only
three variables. The class of strongly fluted clauses is also not closed under
resolution. Fortunately, however, inferences with two strongly fluted clauses always
produce fluted clauses, and non-strongly fluted clauses are what we call separable
and can be restored to strongly fluted clauses.

Consider the resolvent $C = P(x_1,x_2) \lor P(x_2,x_3)$ of the strongly fluted
clauses $P(x_1,x_2) \lor R(x_1,x_2,x_3)$ and $\neg R(x_1,x_2,x_3) \lor P(x_2,x_3)$. $C$ is a fluted
clause of type (FL3), but it is not strongly fluted, as none of its literals contains
all the variables of the clause. Consequently, the literals are incomparable under
an admissible ordering (in particular, a liftable ordering), because the literals
have a common instance, for example $C\{x_1 \mapsto a, x_2 \mapsto a, x_3 \mapsto a\} = P(a,a) \lor P(a,a)$.
The ‘culprits’ are the variables $x_1$ and $x_3$. Because they do not occur
together in any literal, $C$ can be separated and replaced by the following two
clauses, where $q$ is a new predicate symbol.

$\neg q(x_2) \lor P(x_1,x_2)$
$q(x_2) \lor P(x_2,x_3)$

¹ More generally, it requires at least the introduction of new symbols for all positive
occurrences of universally quantified subformulae, all negative occurrences of
existentially quantified subformulae, and all quantified subformulae with zero polarity. But
then inner Skolemisation needs to be used, first Skolemising the deepest existential
formulae.

The first clause is of type (FL1) (and thus strongly fluted) and the second is a
strongly fluted clause of type (FL3).

In the remainder of this section we will formally define separation and con- 
sider under which circumstances soundness and completeness hold. In the next 
section we will show how separation can be used to stay within the class of fluted 
clauses. 

Let $C$ be an arbitrary (not necessarily fluted) clause. $C$ is separable if it can
be partitioned into two non-empty subclauses $D_1$ and $D_2$ such that $\mathrm{var}(D_1) \not\subseteq \mathrm{var}(D_2)$
and $\mathrm{var}(D_2) \not\subseteq \mathrm{var}(D_1)$. For example, the clauses $P(x_1,x_2) \lor Q(x_2,x_3)$
and $P(x_1) \lor Q(x_2)$ are separable, but $P(x_1,x_2) \lor Q$ and $P(x_1,x_2) \lor Q(x_2,x_3) \lor R(x_1,x_3)$
are not. (The last clause is not fluted.)
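
The definition of separability can be checked mechanically on small clauses. The following is only a brute-force sketch: the encoding of a literal as a pair (name, set of variables) and the function names are assumptions made for illustration, and the search over all partitions is exponential, unlike the linear-time procedure available for fluted clauses.

```python
from itertools import combinations

def variables(literals):
    """Union of the variable sets of a collection of literals."""
    return set().union(*(vs for _, vs in literals))

def separations(clause):
    """Yield partitions (D1, D2) of `clause` that witness separability.

    A clause is a list of (name, variables) pairs; only the variable
    sets matter for the definition (hypothetical encoding).
    """
    n = len(clause)
    for size in range(1, n):
        for chosen in combinations(range(n), size):
            d1 = [clause[i] for i in chosen]
            d2 = [clause[i] for i in range(n) if i not in chosen]
            v1, v2 = variables(d1), variables(d2)
            # var(D1) is not a subset of var(D2), and vice versa
            if not v1 <= v2 and not v2 <= v1:
                yield d1, d2

def is_separable(clause):
    """True iff at least one witnessing partition exists."""
    return next(separations(clause), None) is not None

# The four example clauses from the text.
c1 = [("P", {"x1", "x2"}), ("Q", {"x2", "x3"})]
c2 = [("P", {"x1"}), ("Q", {"x2"})]
c3 = [("P", {"x1", "x2"}), ("Q", set())]
c4 = [("P", {"x1", "x2"}), ("Q", {"x2", "x3"}), ("R", {"x1", "x3"})]
```

On these inputs the first two clauses are reported separable and the last two are not, matching the examples above.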

Theorem 3. Let $C \lor D$ be a separable clause such that $\mathrm{var}(C) \not\subseteq \mathrm{var}(D)$,
$\mathrm{var}(D) \not\subseteq \mathrm{var}(C)$, and $\mathrm{var}(C) \cap \mathrm{var}(D) = \{x_1, \ldots, x_n\}$ for $n \geq 0$. Let $q$ be a
fresh predicate symbol with arity $n$ ($q$ does not occur in $N$). Then, $N \cup \{C \lor D\}$
is satisfiable iff $N \cup \{\neg q(x_1, \ldots, x_n) \lor C,\ q(x_1, \ldots, x_n) \lor D\}$ is satisfiable.

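On the basis of this theorem we can define the following replacement rule:

$$\textbf{Separate:}\quad \frac{N \cup \{C \lor D\}}{N \cup \{\neg q(x_1, \ldots, x_n) \lor C,\ q(x_1, \ldots, x_n) \lor D\}}$$

provided (i) $C \lor D$ is separable such that $\mathrm{var}(C) \not\subseteq \mathrm{var}(D)$ and
$\mathrm{var}(D) \not\subseteq \mathrm{var}(C)$, (ii) $\mathrm{var}(C) \cap \mathrm{var}(D) = \{x_1, \ldots, x_n\}$ for $n \geq 0$,
and (iii) $q$ does not occur in $N$, $C$ or $D$.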

C and D will be referred to as the separation components of C V D. 
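
As an illustration of the rule, the sketch below applies “Separate” to two given separation components. The literal encoding (sign, predicate, argument tuple), the restriction to variable arguments, and the fresh-name scheme `q1, q2, ...` are all assumptions of this example; in particular, freshness with respect to $N$ is simply assumed rather than checked.

```python
from itertools import count

_fresh = count(1)  # hypothetical source of fresh predicate names

def var_set(literals):
    """Variables of a list of (positive, predicate, args) literals."""
    return set().union(*(set(args) for _, _, args in literals))

def separate(c, d):
    """Replace C v D by {~q(x1..xn) v C, q(x1..xn) v D}.

    Arguments are variable names only (function terms are omitted
    from this sketch). The shared variables become the arguments of q.
    """
    vc, vd = var_set(c), var_set(d)
    if vc <= vd or vd <= vc:
        raise ValueError("C and D are not separation components")
    shared = tuple(sorted(vc & vd))      # x_1, ..., x_n, possibly n = 0
    q = f"q{next(_fresh)}"               # assumed not to occur in N, C or D
    return [(False, q, shared)] + c, [(True, q, shared)] + d

# P(x1,x2) v Q(x2,x3) with components C = P(x1,x2) and D = Q(x2,x3)
left, right = separate([(True, "P", ("x1", "x2"))],
                       [(True, "Q", ("x2", "x3"))])
```

Here `left` encodes $\neg q(x_2) \lor P(x_1,x_2)$ and `right` encodes $q(x_2) \lor Q(x_2,x_3)$; each has two variables, one fewer than the original clause.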

Lemma 6. The replacements of a separable clause $C$ each contain fewer variables
than $C$.

Even though it is possible to define an ordering under which the replacement
clauses are strictly smaller than the original clause, and consequently, $C \lor D$ is
redundant in $N \cup \{\neg q(x_1, \ldots, x_n) \lor C,\ q(x_1, \ldots, x_n) \lor D\}$, in general,
“Separate” is not a simplification rule in the sense of Bachmair-Ganzinger.
Nevertheless, we can prove the following.

Theorem 4. Let $R^{\mathrm{sep}}$ denote the extension of $R$ with the separation inference
rule. Let $N$ be a set of clauses and let $T$ be a fair $R^{\mathrm{sep}}$-derivation from $N$ such that
separation is applied only finitely often in any path of $T$. Then $N$ is unsatisfiable
iff for every path $N (= N_0), N_1, \ldots$, the clause set $\bigcup_j N_j$ contains the empty
clause.

More generally, this theorem holds also if $R^{\mathrm{sep}}$ is based on ordered resolution
(or superposition) with selection.

By Lemma 3 separable fluted clauses have the form

$$C(\overline{x_1}) \lor C(\overline{x_i}, x_{m+1}) \lor \ldots \lor C(\overline{x_m}, x_{m+1}) \lor C(x_{m+1}), \qquad (1)$$

where $(\overline{x_1})$ is a non-empty $m$-sequence, $C(\overline{x_1})$ is not empty, and $i$ is the smallest
integer $1 < i \leq m$ such that $C(\overline{x_i}, x_{m+1})$ is not empty.

Let sep be a mapping from separable fluted clauses of the form (1) to sets of
clauses defined by

$$\mathrm{sep}(C) = \{\neg q(\overline{x_i}) \lor C(\overline{x_1}),\ \ q(\overline{x_i}) \lor C(\overline{x_i}, x_{m+1}) \lor \ldots \lor C(\overline{x_m}, x_{m+1}) \lor C(x_{m+1})\}$$

where $q$ is a fresh predicate symbol uniquely associated with $C$ and all its
variants. Further, let $\mathrm{sep}(N) = \bigcup\{\mathrm{sep}(C) \mid C \in N\}$. For example:

$$\mathrm{sep}(P(x_1,x_2) \lor Q(x_2,x_3)) = \{\neg q(x_2) \lor P(x_1,x_2),\ q(x_2) \lor Q(x_2,x_3)\}.$$

Lemma 7. The separation of a separable fluted clause (1) is a set of strongly 
fluted clauses. 



Lemma 8. For fluted clauses a separation inference step can be performed in
linear time.

7 Termination 

In this section we define a minimal resolution calculus and prove that it 
provides a decision procedure for fluted logic. 

The ordering $\succ$ of $R^{\mathrm{sep}}$ is required to be any admissible ordering
compatible with the following complexity measure. Let $\succ_s$ denote the proper
superterm ordering. Define the complexity measure of any literal $L$ by
$c_L = (\mathrm{ar}(L), \max(L), \mathrm{sign}(L))$, where $\mathrm{ar}(L)$ is the arity (of the predicate symbol) of $L$,
$\max(L)$ is a $\succ_s$-maximal term occurring in $L$, and $\mathrm{sign}(L) = 1$, if $L$ is negative,
and $\mathrm{sign}(L) = 0$, if $L$ is positive. The ordering on the complexity measures is
given by the lexicographic combination of $>$, $\succ_s$, and $>$ (where $>$ is the usual
ordering on the non-negative integers).
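
To make the lexicographic combination concrete, here is a small sketch of the complexity measure and its comparison. The term encoding (variables as strings, compound terms as tuples headed by the function symbol), the literal encoding (negative flag, predicate, arguments), and the choice of the first maximal argument when several exist are assumptions of this illustration, not part of the calculus.

```python
def subterms(t):
    """All subterms of t, including t itself.

    Variables are strings; compound terms are tuples
    (function_symbol, arg1, ..., argk).
    """
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)

def proper_superterm(s, t):
    """s >_s t: t occurs in s and s differs from t."""
    return s != t and any(u == t for u in subterms(s))

def measure(literal):
    """(ar(L), max(L), sign(L)) for a literal (negative, predicate, args)."""
    negative, _, args = literal
    # choose an argument that is not a proper subterm of another argument
    maximal = next((a for a in args
                    if not any(proper_superterm(b, a) for b in args)), None)
    return (len(args), maximal, 1 if negative else 0)

def greater(c1, c2):
    """Lexicographic combination of >, >_s and > on complexity measures."""
    (a1, t1, s1), (a2, t2, s2) = c1, c2
    if a1 != a2:
        return a1 > a2
    if t1 != t2:
        return proper_superterm(t1, t2)
    return s1 > s2

pos = (False, "P", ("x1", ("f", "x1")))      # P(x1, f(x1))
neg = (True, "P", ("x1", ("f", "x1")))       # ~P(x1, f(x1))
flat = (False, "P", ("x1", "x2"))            # P(x1, x2)
wide = (False, "Q", ("x1", "x2", "x3"))      # Q(x1, x2, x3)
```

With these literals, `wide` dominates all the binary literals because arity is compared first, `pos` dominates `flat` because $f(x_1) \succ_s x_1$, and `neg` dominates `pos` by sign alone.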

Let $R^{\mathrm{sep}}$ be any calculus in which (i) derivations are generated by strategies
applying “Delete”, “Split”, “Separate”, namely, $N \cup \{C\} / N \cup \mathrm{sep}(C)$, and
“Deduce” in this order, (ii) no application of “Deduce” with identical premises and
identical consequence may occur twice on the same path in derivations, and (iii)
the ordering is based on the complexity measure ordering defined above.

Now we address the question as to whether the class of fluted clauses is closed
under $R^{\mathrm{sep}}$-inferences.

Lemma 9. A factor of a strongly fluted clause $C$ is again a strongly fluted clause
of the same type as $C$.

In fact, any (unordered) factor of a strongly fluted clause C is again a strongly 
fluted clause of the same type. 

The next lemma is the most important technical result. 
Lemma 10. Let $C = C' \lor A_1$ and $D = \neg A_2 \lor D'$ be (FL2)-clauses. Suppose
$A_1$ and $\neg A_2$ are eligible literals in $C$ and $D$, respectively, and suppose $\sigma$ is the
most general unifier of $A_1$ and $A_2$. Then:

1. $C\sigma$, $D\sigma$ and $C\sigma \lor D\sigma$ are (FL2)-clauses.

2. For any functional literal $L\sigma$ in $C\sigma \lor D\sigma$, the fluted sequence associated
with $L\sigma$ is the $\sigma$-instance of a fluted sequence $v$ associated with some literal
$L'$ in $C \lor D$.



Lemma 11. Let $C = C' \lor A_1$ and $D = \neg A_2 \lor D'$ be strongly fluted clauses.
Suppose $A_1$ and $\neg A_2$ are eligible literals in $C$ and $D$, respectively, and suppose
$\sigma$ is the most general unifier of $A_1$ and $A_2$. Then $C\sigma \lor D\sigma$ is a strongly fluted
clause.



Lemma 12. Let $C$, $D$ and $\sigma$ be as in Lemma 11. Then,
$|\mathrm{var}(C\sigma \lor D\sigma)| \leq \max\{|\mathrm{var}(C)|, |\mathrm{var}(D)|\}$.



Lemma 13. Removing any subclause from a fluted clause produces a fluted 
clause. 

This cannot be said for strongly fluted clauses, in particular, not for clauses of
the form (FL3). For all other forms the statement is also true for strongly fluted
clauses, namely, removing any subclause from strongly fluted clauses produces
strongly fluted clauses. Consequently:

Lemma 14. The condensation of any (strongly) fluted clause is a (strongly) 
fluted clause. 



Lemma 15. The resolvent of any two strongly fluted clauses is a strongly fluted 
clause, or, it is only a fluted clause, if one of the premises is a (FL3)-clause. 



Lemma 16. Any maximally split, condensed and separated factor or resolvent 
of strongly fluted clauses is strongly fluted. 

This proves that the class of (strongly) fluted clauses is closed under
$R^{\mathrm{sep}}$-resolution with eager application of condensing, splitting and separation.

In the next three lemmas, N is assumed to be a finite set of fluted clauses 
(which will be transformed into a set of strongly fluted clauses during the deriva- 
tion, see Lemma 7). Our goal is to exhibit the existence of a term depth bound 
of all inferred clauses, as well as the existence of a bound on the number of 
variables occurring in any inferred clause. The latter follows immediately from 
Lemmas 6 and 12. 



Lemma 17. All clauses occurring in an $R^{\mathrm{sep}}$-derivation from $N$ contain at most
$m + 1$ variables, where $m$ is the maximal arity of any predicate symbol in $N$.

The definition of fluted clauses places no restriction on the level of nesting
of functional terms. But:

Lemma 18. A bound on the maximal term depth of clauses derived by $R^{\mathrm{sep}}$
from $N$ is $m$, where $m$ is the maximal arity of any predicate symbol in $N$.

Because the signature is extended dynamically during the derivation, it re- 
mains to show that separation cannot be performed infinitely often. 

Lemma 19. The number of applications of the “Separate” rule in an
$R^{\mathrm{sep}}$-derivation from $N$ is bounded.

Now, we can state the main theorem of this paper. 

Theorem 5. Let $\varphi$ be any fluted formula and $N = \mathrm{ClsDef}_\Lambda(\varphi)$, where $\mathrm{Def}_\Lambda$
satisfies the restrictions of Lemma 5. Then:

1. Any $R^{\mathrm{sep}}$-derivation from $N$ (up to redundancy) terminates.

2. $\varphi$ is unsatisfiable iff the $R^{\mathrm{sep}}$-saturation (up to redundancy) of $N$ contains
the empty clause.

The final theorem gives a rough estimation of an upper bound for the space 
requirements. 

Theorem 6. The number of maximally split, condensed strongly fluted clauses
in any $R^{\mathrm{sep}}$-derivation from $N$ is an $O(m)$-story exponential, where $m$ is the
maximal arity of any predicate symbol in $N$.

8 Concluding Remarks 

Developing a resolution decision procedure for fluted logic turned out to be
more complicated than expected. Even though to begin with, clauses are simple
in the sense that no nesting of non-nullary function symbols occurs (Lemma 4),
the class of fluted clauses is rather complex. It is thus natural to ask whether
there is a less complex clausal class which corresponds to fluted logic. The
complexity of the class is a result of the ordering we have proposed. This ordering is
unusual as it first considers the arity of a literal, while more conventional
ordering refinements first consider the depth of a literal. A conventional ordering has
the advantage that term depth growth can be avoided completely. This would
induce a class of clauses which can be described by these schemes:

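propositional clauses    (2)

$C(\overline{x_1}) \lor C(\overline{x_2}) \lor \ldots \lor C(\overline{x_m}) \ (= \nabla)$    (3)

$\nabla \lor C(\overline{x_1}, f(\overline{x_1})) \lor C(\overline{x_2}, f(\overline{x_1})) \lor \ldots \lor C(\overline{x_m}, f(\overline{x_1})) \lor C(f(\overline{x_1}))$    (4)

$C(\overline{x_1}) \lor C(\overline{x_1}, x_{m+1}) \lor C(\overline{x_2}, x_{m+1}) \lor \ldots \lor C(\overline{x_m}, x_{m+1}) \lor C(x_{m+1})$    (5)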

The difference between this class and the class of fluted clauses defined in
Section 4 is scheme (4). Clauses satisfying scheme (4) are (FL2)-clauses, but not
every (FL2)-clause has the form (4). With separation (on (5)-clauses without an
embracing literal) it is possible to stay within the confines of this class. However,
the danger with the separation rule is that it could be applied infinitely often. It
is open whether there is a clever way of applying the separation rule so that only
finitely many new predicate symbols are introduced. For fluted logic with binary
converse an example giving rise to an unbounded derivation is the following.

P2(a;i,a;2,a;3) V Hi(/(a;i, ^2, a^s), 

$\neg P_2(x_1,x_2,x_3) \lor \neg P_1(x_1,x_2) \lor P_0(x_2,x_3)$

$\neg Q_1(x_2) \lor \neg P_1(x_1,x_2)$



We do not know whether an alternative form of the separation rule could help. 

Noteworthy about fluted logic and the proposed method is that, in order to 
establish an upper bound on the number of variables in derived clauses, a truly 
dynamic renaming rule is needed (namely separation). Though renaming is a 
standard technique for transforming formulae into well-behaved clausal classes, 
it is usually applied in advance, see for example [7,15]. From a theoretical point 
of view whenever it is possible to do the renaming transformations as part of 
preprocessing, it is sensible to do so. The above example illustrates what could 
go wrong otherwise. It should be added though that there are instances where
renaming on the fly is useful [27]. For fluted logic it is open whether there is a
resolution decision procedure which does not require dynamic renaming. Going by
the experience with other solvable classes, for example, Maslov’s class K [5,14], 
where renaming is only necessary when liftable ordering refinements are used, 
one possibility for avoiding dynamic renaming may be by using a refinement 
which is based on a non-liftable ordering. However, it would seem that the prob- 
lems described in Section 6 are the same with non-liftable orderings. Even if it 
turns out that there is a resolution decision procedure which does not use sepa- 
ration, one could imagine that the separation rule can have a favourable impact 
on the performance of a theorem prover, for, with separation the size of clauses 
can be kept small, which is generally desirable, and for fluted logic separation is
a cheap operation (Lemma 8).

As noted earlier, the splitting rule is not essential for the results of this paper.
The separation rule already facilitates some form of ‘weak splitting’, because, if $C$
and $D$ are variable disjoint and non-ground subclauses of $C \lor D$ then separation
will replace it by $q \lor C$ and $\neg q \lor D$, where $q$ is a new propositional symbol.
A closer resemblance to the splitting rule can be achieved by making $q$ minimal
in $q \lor C$ and selecting $\neg q$ in $\neg q \lor D$. Nevertheless, splitting has the advantage
that more redundancy elimination operations are possible, for example forward
subsumption.

The realisation of a practical decision procedure for fluted logic would require 
a modest extension of one of the many available first-order theorem provers which 
are based on ordered resolution with an implementation of the separation rule. 
Modern theorem provers such as Spass [28] are equipped with a wide range of 
simplification rules so that reasonable efficiency could be expected. 
Abstract. ZRes is a propositional prover based on the original procedure
of Davis and Putnam, as opposed to its modified version by Davis,
Logemann and Loveland, on which most of the current efficient SAT
provers are based. On some highly structured SAT instances, such as
the well-known Pigeon Hole and Urquhart problems, both proved hard
for resolution, ZRes performs very well and surpasses all classical SAT
provers by an order of magnitude.



1 The DP and DLL Algorithms 

Stimulated by hardware progress, many increasingly efficient SAT solvers have
been designed during the last decade. It is striking that most of the complete
solvers are based on the procedure of Davis, Logemann and Loveland (DLL for
short) presented in 1962 [11]. The DLL procedure may roughly be described
as a backtrack procedure that searches for a model. Each step amounts to the 
extension of a partial interpretation by choosing an assignment for a selected 
variable. The success of this procedure is mainly due to its space complexity, 
since making choices only results in simplifications. However, the number of 
potential extensions remains exponential. Therefore, if the search space cannot 
be pruned by clever heuristics, this approach becomes intractable in practice. 

The picture is very different with DP, the original Davis-Putnam algorithm
[3]. DP is able to determine whether a propositional formula $f$, expressed in
conjunctive normal form (CNF), is satisfiable or not. Assuming the reader is familiar
with propositional logic, DP may be roughly described as follows [9]:

I. Choose a propositional variable $x$ of $f$.

II. Replace all the clauses which contain the literal $x$ (or $\neg x$) by all
binary resolvents (on $x$) of these clauses (cut elimination of $x$),
and remove all subsumed clauses.

III. a. If the new set of clauses is reduced to the empty clause, then the
original set is unsatisfiable.

b. If it is empty, then the original formula is satisfiable.

c. Otherwise, repeat steps I-III for this new set of clauses.
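
Steps I-III can be sketched directly on explicit clause sets. This is a naive illustration, not the ZRes implementation: clauses are sets of non-zero integers in the DIMACS style (a negative number denotes a negated variable), the variable is chosen by smallest index (a placeholder heuristic assumed for this sketch), and subsumption is checked pairwise.

```python
from itertools import product

def is_tautology(clause):
    """A clause containing both x and -x is trivially true."""
    return any(-lit in clause for lit in clause)

def remove_subsumed(clauses):
    """Keep only clauses that have no proper subset in the set."""
    clauses = set(clauses)
    return {c for c in clauses if not any(d < c for d in clauses)}

def dp_satisfiable(cnf):
    """Davis-Putnam by successive cut (variable) elimination."""
    current = remove_subsumed(
        c for c in map(frozenset, cnf) if not is_tautology(c))
    while True:
        if frozenset() in current:       # step III.a: empty clause derived
            return False
        if not current:                  # step III.b: no clause left
            return True
        # Step I: choose a variable (placeholder heuristic: smallest index).
        x = min(abs(lit) for c in current for lit in c)
        pos = [c for c in current if x in c]
        neg = [c for c in current if -x in c]
        rest = {c for c in current if x not in c and -x not in c}
        # Step II: all binary resolvents on x, dropping tautologies.
        resolvents = set()
        for p, n in product(pos, neg):
            r = (p - {x}) | (n - {-x})
            if not is_tautology(r):
                resolvents.add(frozenset(r))
        current = remove_subsumed(rest | resolvents)   # step III.c: iterate

# Two pigeons, one hole: variable i means "pigeon i sits in the hole".
tiny_hole = [[1], [2], [-1, -2]]
```

The number of resolvents produced in step II is the product of the sizes of `pos` and `neg`, which is where the potential exponential growth of the explicit representation comes from.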
As opposed to DLL, DP avoids making choices by considering the two possible
instantiations of a variable simultaneously. It amounts to a sequence of cut
eliminations. Since the number of clauses generated at each step may grow
exponentially, it is widely acknowledged in the literature as inefficient, although no
real experimental study has been conducted to confirm this point. Dechter and
Rish [5] were the first to point out some instances for which DLL is not appropriate
and where DP obtains better results. A more comprehensive experimentation
has been conducted in [2], to evaluate DP on a variety of instances. While this study
confirms the superiority of DLL on random instances, it also shows that for
structured instances, DP may outperform some of the best DLL procedures.
Substantial progress for DLL is due to better heuristics. For DP, significant
improvements are possible thanks to efficient data structures for representing
very large sets of clauses.

Several authors have pointed out that resolution-based provers (like DP and 
DLL) are intrinsically limited, since they have found instances that require an 
exponential number of resolution steps to be solved (e.g. [9,10,14]). This is the 
case for the Pigeon Hole [10] and for the Urquhart problem [14]. They suggest 
that more powerful proof systems have to be used practically to solve such prob- 
lems efficiently. However, all these results are based on the implicit hypothesis 
that successive resolutions in DP and DLL are performed one by one. 

This paper presents the ZRes system, which is an implementation of the DP 
algorithm that is able to perform several resolutions in a single step. As a result, 
ZRes is able to solve instances of such hard problems much more efficiently than 
the best current DLL provers. 



2 Reviving DP 

The crucial point of DP is step II, which tends to generate a very large number of 
clauses. Eliminating subsumed clauses at this step induces a significant overhead 
but still pays off. 



Efficient Data Structures. In [2], Trie structures are used to represent sets of
clauses. Until now, they seem to remain the state-of-the-art data structures for
subsumption checking [15,4]. Tries allow the factorization of clauses beginning in
the same way, according to a given order on literals. In ZRes we further
generalize this principle, to allow the factorization of the end as well as the beginning of
clauses simultaneously. Sets of clauses are thus represented by means of directed
acyclic graphs (DAG) instead of trees (as with Tries).

Using DAGs to represent a boolean formula has been intensively investigated
in many works on binary decision diagrams (BDD) [1,7]. Many variants of BDD
have been proposed but all attempt to compute the BDD encoding of the
formula, expressed in Shannon normal form. From the SAT point of view and since
the resulting BDD characterizes the formula validity and satisfiability, this
construction is de facto more difficult than testing for satisfiability.
The approach followed by ZRes is quite different since, instead of computing
the Shannon normal form, we use here BDD-like structures only to represent sets
of clauses. Practically, the set of clauses is represented by a ZBDD (a variant
of BDD [12]), proved useful for manipulations of large sets. Sets are represented
by their characteristic (boolean) functions and basic set operations can thus be
performed as boolean operations on ZBDD. Moreover, it has been shown that the
size of such a ZBDD is not directly related to the size of the corresponding set.
Since the cost of basic set operations only depends on the size of the considered
ZBDD, this suggests performing the cut elimination step of the DP algorithm
directly at the set level.
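
The gap between the size of a set of clauses and the size of its ZBDD can be seen on a toy example. The sketch below is a minimal hash-consed ZBDD written for illustration only; it is not the Cudd API, and all names in it are hypothetical. A clause is encoded as a set of positive integers (one integer per literal), and a ZBDD represents a family of such sets.

```python
from itertools import product

EMPTY, BASE = 0, 1        # terminals: the empty family, and the family {{}}
_unique = {}              # unique table: (var, lo, hi) -> node id
_table = [None, None]     # node id -> (var, lo, hi); ids 0 and 1 are terminals

def node(var, lo, hi):
    """Canonical node; zero-suppression drops nodes whose hi branch is EMPTY."""
    if hi == EMPTY:
        return lo
    key = (var, lo, hi)
    if key not in _unique:
        _unique[key] = len(_table)
        _table.append(key)
    return _unique[key]

def from_set(elems):
    """ZBDD of the family containing the single set `elems`."""
    z = BASE
    for v in sorted(elems, reverse=True):   # smallest variable ends up at the root
        z = node(v, EMPTY, z)
    return z

def union(a, b):
    """Set union of two families."""
    if a == EMPTY or a == b:
        return b
    if b == EMPTY:
        return a
    if a == BASE:                           # b is an internal node here
        v, lo, hi = _table[b]
        return node(v, union(BASE, lo), hi)
    if b == BASE:
        return union(b, a)
    (va, la, ha), (vb, lb, hb) = _table[a], _table[b]
    if va < vb:
        return node(va, union(la, b), ha)
    if vb < va:
        return node(vb, union(a, lb), hb)
    return node(va, union(la, lb), union(ha, hb))

def count(z):
    """Number of sets in the family."""
    if z <= BASE:
        return z
    _, lo, hi = _table[z]
    return count(lo) + count(hi)

def size(z):
    """Number of distinct internal nodes reachable from z."""
    seen, stack = set(), [z]
    while stack:
        n = stack.pop()
        if n > BASE and n not in seen:
            seen.add(n)
            stack += _table[n][1:]
    return len(seen)

# All clauses that pick one literal from each of ten pairs (1,2), (3,4), ...
family = EMPTY
for choice in product(*[(2 * i - 1, 2 * i) for i in range(1, 11)]):
    family = union(family, from_set(choice))
```

The family holds $2^{10} = 1024$ clauses of ten literals each, yet its ZBDD has only 20 nodes, because clauses share both their beginnings and their ends.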

Another way to represent $f$ in the cut elimination step of DP is to factorize
$x$ and $\neg x$ among its clauses. The formula $f$ can then be rewritten as
$(x \lor f^+) \land (\neg x \lor f^-) \land f'_x$, where $f^+$ (resp. $f^-$) is the CNF obtained from the set of
clauses containing $x$ (resp. $\neg x$), after factorization, and where $f'_x$ denotes the
set of clauses containing neither $x$ nor $\neg x$. The second step of the algorithm
then amounts to putting the formula $(f^+ \lor f^-) \land f'_x$ into CNF. This can be done
in 3 stages. First, distribute the set of clauses $f^+$ over $f^-$. Second, eliminate
tautologies and subsumed clauses from the resulting clauses. Third, compute
the union of the remaining clauses with those of $f'_x$, while deleting subsumed
clauses. The first two stages could be performed successively, using standard
operations on ZBDD. However, the ZBDD used in ZRes have a special semantics
and thus, a more efficient algorithm, called clause-distribution, can be designed.
This operation guarantees that, during the bottom-up construction of the result,
each intermediate ZBDD is free of tautologies and subsumed clauses. Tautologies
are eliminated on the fly and subsumed clauses are deleted by a set difference
operation, at each level. Similarly, in the third stage of the cut elimination,
subsumed clauses may be deleted while computing the union of the two sets of
clauses. This new algorithm takes full advantage of the data structure used to
represent sets of clauses.



3 Experimental Results 

ZRes¹ is written in C, using the Cudd package [13], which provides us with basic
ZBDD operations as well as useful dynamic reordering functions. We have tested
ZRes on two classes of hard problems for resolution: Hole and Urquhart. These
tests have been performed on a Linux Pentium-II 400MHz² with 256MB. Our
results are compared, when possible, with those of two DLL implementations:
Asat [8], which is a good-but-simple DLL implementation, and Sato 3.0 [15],
which includes many optimizations and recent improvements, such as
backjumping and conflict memorization. CPU times are given in seconds. We assume that
an instance that cannot be solved in less than 10000 seconds counts for 10000.

¹ ZRes is available at http://www.lri.fr/~simon/research/zres.

² On the DIMACS [6] machine scale benchmark, our tests have granted this machine
a user time saving of 305%, in comparison with the Sparc10.41, given as a reference.
The Hole Problem. The following table describes the results obtained on
different instances of the Hole problem by Asat, Sato and ZRes. For ZRes,
the std and cd columns give the times obtained without and with the
clause-distribution operation, respectively.

Instances  Var. Nb.  Cl. Nb.     Asat     Sato  ZRes std  ZRes cd
Hole-09          90      415    11.94     8.90      1.87     “ToT
Hole-10         110      561   141.96    80.94      3.26     1.61
Hole-11         132      738  1960.77  7373.65      5.76     2.65
Hole-12         156      949    10000    10000     10.18     4.06
Hole-20         420     4221        -        -     654.8       69
Hole-30         930    13981        -        -     10000     1102
Hole-40        1640    32841        -        -         -     9421



ZRes clearly surpasses both Asat and Sato. While DLL algorithms can’t
solve instances of Hole-n for n > 11, ZRes manages to solve much larger
instances³. As we can see, the speedup induced by the clause-distribution operation
is significant on such instances.

Our experiments have shown that this problem is very sensitive to the
heuristic function used to choose the cut variable. Surprisingly, the best results were
obtained using a heuristic function that tends to maximize the number of clauses
produced by the cut. Other heuristics, such as in [2], did not allow us to solve those
Hole instances.



The Urquhart Problem. Urquhart has described a class of problems based
on properties of expander graphs [14]. Actually, each Urq-n is a class of problems
where the number of clauses and variables is not fixed but only bounded to a
specific interval. In the last table, MnV (resp. MnC) denotes the mean number
of variables (resp. clauses) for a set of instances of a given class. Contrary to Hole,
the Urquhart problem does not seem sensitive to the heuristics used. The results of
Asat and Sato on 100 Urq-3 instances attest to the hardness of these problems:

System    Total cpu time  # resolved  Mean cpu time (resolved)
Asat             404 287          69                      1366
Sato             776 364          26                      1398
ZRes cd             69.2         100                      0.69
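
The three columns of this table are related through the timeout convention stated earlier (an unsolved instance counts for 10000 seconds). The short check below recomputes an estimate of each total from the other two columns; reading "404 287" and "776 364" as the integers 404287 and 776364 is an assumption of this sketch.

```python
TIMEOUT = 10000      # seconds charged for each unsolved instance
INSTANCES = 100      # number of Urq-3 instances in the experiment

# system: (total cpu time, instances solved, mean cpu time over solved ones)
rows = {
    "Asat": (404287, 69, 1366),
    "Sato": (776364, 26, 1398),
    "ZRes cd": (69.2, 100, 0.69),
}

def estimated_total(solved, mean):
    """Solved instances at their mean time plus the timeout for the rest."""
    return solved * mean + (INSTANCES - solved) * TIMEOUT

for name, (total, solved, mean) in rows.items():
    print(name, total, round(estimated_total(solved, mean), 1))
```

The estimates (404254 for Asat and 776348 for Sato) differ from the reported totals by 33 and 16 seconds, which is within what the rounding of the mean times can account for.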



Solving instances for greater values of n seems out of the scope of these
systems. On the other hand, ZRes performs quite well on such instances. The
following table gives the mean time on 1000 instances for greater values of n.
Note that we do not give the std time because the speedup due to the
clause-distribution operation is not relevant for this problem.

Instances  MnV   MnC  Mean cpu-time
Urq-3       42   364           0.57
Urq-4       77   705           1.72
Urq-5      123  1143           4.25
Urq-6      178  1665           8.88
Urq-7      242  2299           16.5
Urq-8      317  3004           29.6
Urq-9      403  3837           48.8

³ We even solved the Hole-55 instances, with 3080 variables and 84756 clauses, in less
than 2 days.



About the Compression Power of ZBDD. The previous examples illustrate
quite well that DP, associated with the ZBDD encoding of sets of clauses, may
in some cases be more effective than DLL. The experiment on Hole-40 shows
that for some of the cut eliminations, the number of clauses corresponding to
$f^+$ and $f^-$ may exceed 10®°. Clearly, the result of such a cut could not be computed
efficiently without using such an extremely compact data structure. The ability
of ZBDD to capture redundancies in sets of clauses suits the DP algorithm
particularly well. Indeed, additional redundancies are produced during each cut
elimination, when each clause of $f^+$ is merged with each clause of $f^-$. Moreover,
unlike in random instances, we think that such redundancies may also be found
in structured instances, corresponding to real-world problems.

In order to appreciate the compression power of ZBDD structures, it is
interesting to consider the level of compression, which may be characterized by the
ratio (number of literals) / (number of nodes). We have recorded its successive values on 1000
random 3-SAT instances of 42 variables and 180 clauses, on which DP is known
to be a poor candidate [2]. For such instances, the initial value of the ratio is
about 2, then it increases up to 6.15 and eventually decreases together with the
number of clauses. In contrast, on Hole-40, this ratio varies from 10 to more
than 10°°. Similarly, on Urq-10 it may exceed 10°^. The Pigeon Hole and Urquhart classes
however correspond to extreme cases. We have also tested ZRes on some other
instances of the SAT DIMACS base [6]. Results are particularly interesting. For
some instances the compression level is much higher than for random
instances (more than 10®), while on others, like ssa or flat-50, it is very close to
that of random instances. It is however striking that the latter instances, even if
they correspond to concrete problems, have been generated in a random manner.
This seems to confirm that our hypothesis (structured problems generally have
regularities) is well founded.

4 Conclusion 

Dechter and Rish [5] were the first to revive the interest in the original
Davis-Putnam procedure for SAT. But DP also proves useful for knowledge compilation
or validation techniques, which was our initial motivation [2]. The introduction
of ZBDD brings significant improvements allowing ZRes to deal with huge sets
of clauses. It leads us to completely reconsider the performance of the cut
elimination, which can be performed independently of the number of handled clauses.
It is thus able to solve two hard problems out of the scope of other
resolution-based provers. Although such examples might be considered somewhat artificial,
their importance in the study of the complexity of resolution procedures must
not be forgotten. On other examples, such as the DIMACS ones, results are not
so good, but significant compression levels, due to ZBDD, can be observed on
real-world instances. The strength of ZRes definitely comes from its ability to
capture regularities in sets of clauses. Although it has no chance to compete on
random instances, which lack such regularities, it might be a better candidate
for solving real-world problems.

Further improvements are possible. The Hole example pointed out that
standard heuristics for DP are not always appropriate for ZRes. We are studying
new heuristics, based on the structure of ZBDD rather than on what they
represent. One may also investigate better-suited reordering algorithms, which take
advantage of the particular semantics of the ZBDD used in ZRes. Finally,
DP and DLL may be considered as complementary approaches. An interesting
idea is to design a hybrid algorithm integrating both DP and DLL in ZRes.
Abstract. In this paper we describe the MBase system, a web-based,
distributed mathematical knowledge base. This system is a mathematical
service in MathWeb that offers a universal repository of formalized
mathematics where the formal representation allows semantics-based
retrieval of distributed mathematical facts.



1 Introduction 

Around 1994, an anonymous (but well-known) group of authors put forward 
the “Qed Manifesto” [QED95], which advocates building up a mathematical 
knowledge base (and supporting software systems) as a kind of “human genome 
project” for the deduction community. Unfortunately, the vision has failed to 
catch on in spite of a wave of initial interest. In our view this is largely due to 
the lack of supporting software, as well as to the ensuing debate on the “right” 
logical formalism. 

In this paper we describe the MBase system, a web-based mathematical
knowledge base (see http://www.mathweb.org/mbase). It offers the
infrastructure for a universal, distributed repository of formalized mathematics. Since
it is independent of a particular deduction system and particular logic¹, the
MBase system can be seen as an attempt to revive the Qed initiative from an
infrastructure viewpoint. The system is realized as a mathematical service in the
MathWeb system [FK99], an agent-based implementation of a mathematical
software bus for distributed theorem proving.

We will start with a description of the system from the implementation point
of view in the next section (we have described the data model and logical issues
in [KF00]). In section 3, we will take a brief look at the interface protocols based
on the OpenMath and KQML standards (see [FHJ+99,Koh00]). This reliance on
Internet standards for communication makes MBase an open system, and the
implementation presented in this paper just one of its possible instances.

2 Architecture 

The MBase system is realized as a distributed set of MBase servers (see Figure 1). 
Each MBase server consists of a Relational Data Base Management 

¹ See [KF00] for the logical issues related to supporting multiple logical languages 
while keeping a consistent overall semantics. 
Fig. 1. System Architecture 



System (RDBMS), e.g. Oracle, connected to a mOZart process (yielding a 
MathWeb service) via a standard data base interface (in our case JDBC). For 
browsing the MBase content, every MBase server provides an HTTP server (see 
http://mbase.mathweb.org:8000 for an example) that dynamically generates 
presentations based on HTML or XML forms. 

This architecture combines the storage facilities of the RDBMS with the flexibility 
of the concurrent, logic-based programming language Oz [Smo95], of which 
mOZart is a distributed implementation (see http://www.mozart-oz.org). 
Most importantly for MBase, mOZart offers a mechanism called pickling, 
which allows for a limited form of persistence: mOZart objects can be efficiently 
transformed into a so-called pickled form, which is a binary representation of the 
(possibly cyclic) data structure. This can be stored in a byte-string and efficiently 
read back by the mOZart application, effectively restoring the object. This feature 
makes it possible to represent complex objects (e.g. logical formulae) as Oz data 
structures and manipulate them in the mOZart engine, while at the same time storing 
them as strings in the RDBMS. Moreover, the availability of “Ozlets” (mOZart 
functors) gives MBase great flexibility, since the functionality of MBase can 
be enhanced at run-time by loading remote functors. For instance, complex data 
base queries can be compiled by a specialized MBase client, sent (via the Internet) 
to the MBase server and applied to the local data, e.g. for specialized 
searching (see [Duc98] for a related system and the origin of this idea). 
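The pickling idea can be illustrated outside of Oz. The following is only an analogy, not the MBase implementation: it uses Python's standard `pickle` module in place of mOZart pickling and an in-memory SQLite table in place of the RDBMS. The table layout and the helper names `store_object` and `load_object` are assumptions made for this sketch.

```python
import pickle
import sqlite3


def store_object(conn, key, obj):
    """Pickle obj into a byte string and store it under key (hypothetical helper)."""
    blob = pickle.dumps(obj)
    conn.execute(
        "INSERT OR REPLACE INTO objects (key, data) VALUES (?, ?)", (key, blob)
    )


def load_object(conn, key):
    """Read the byte string stored under key and restore the object."""
    row = conn.execute(
        "SELECT data FROM objects WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else pickle.loads(row[0])


# Stand-in for the relational data base: one table of opaque byte strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (key TEXT PRIMARY KEY, data BLOB)")

# A formula tree for the doubling function, with a cyclic reference added
# to show that cyclic structures survive the round trip.
term = {"binder": "lambda", "var": "X", "body": ["plus", "X", "X"]}
term["self"] = term

store_object(conn, "double.def", term)
restored = load_object(conn, "double.def")
```

The data base only ever sees a key and a byte string; all structure-aware manipulation happens on the restored object in the programming language, which is the division of labor described above.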

MBase supports transparent distribution of data among several MBase 
servers (see [KFOO] for details). In particular, an object O residing on an MBase 
server S can refer to (or depend on) an object O' residing on a server S'; a query 
to O that needs information about O' will be delegated to a suitable query to 
the server S'. We distinguish two kinds of MBase servers depending on the data 
they contain: archive servers contain data that is referred to by other MBases, 
whereas scratch-pad MBases contain data that is not referred to. To facilitate 
caching protocols, MBase forces archive servers to be conservative, i.e. only those 
changes to the data are allowed for which the induced change on the corresponding 
logical theory is a conservative extension. This requirement is not a grave restriction: 
in this model errors are corrected by creating new theories (with similar 
presentations) shadowing the erroneous ones. Note that this restriction does not apply 
to the non-logical data, such as presentation or description information, or to 
scratch-pad MBases, which makes them ideal repositories for private development of 
mathematical theories, which can be submitted and moved to archive MBases 
once they have stabilized. 
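A toy model may help to picture query delegation and the archive/scratch-pad distinction. This is a sketch under strong simplifying assumptions: the class `Server` and its methods are hypothetical, and conservativity is crudely approximated by forbidding any overwrite of existing entries on an archive server (a real check would have to establish that the logical theory is only conservatively extended).

```python
class Server:
    """Hypothetical stand-in for an MBase server."""

    def __init__(self, name, archive=False):
        self.name = name
        self.archive = archive
        self.objects = {}  # object id -> (value, reference to a remote object or None)

    def put(self, object_id, value, depends_on=None):
        # Approximation of "conservative": archive entries may be added, never changed.
        if self.archive and object_id in self.objects:
            raise PermissionError(f"{self.name}: archived object {object_id} is immutable")
        self.objects[object_id] = (value, depends_on)

    def query(self, object_id):
        """Answer locally and delegate the dependent part to the remote server."""
        value, depends_on = self.objects[object_id]
        answer = [(self.name, object_id, value)]
        if depends_on is not None:
            remote_server, remote_id = depends_on
            answer += remote_server.query(remote_id)
        return answer


archive = Server("S'", archive=True)
archive.put("nat.plus", "addition on naturals")

scratch = Server("S")
scratch.put("double.def", "lambda X. X + X", depends_on=(archive, "nat.plus"))

answer = scratch.query("double.def")
```

Because archived entries never change in this model, a client could cache the delegated part of `answer` indefinitely, which is the motivation given above for the restriction.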

3 Interfaces 

The primary interface language of MBase is the XML-based markup language 
OMDoc [Koh00], a document-centered extension of the emerging OpenMath 
standard [CC98] for mathematical objects. For instance, the definition of a doubling 
function would be of the following form. 

<definition id="double.def" item="double.sym" type="simple"> 
  <CMP xml:lang="eng">The doubling function defined by addition</CMP> 
  <FMP><OMOBJ><OMBIND> 
    <OMS cd="stlc" name="lambda"/> 
    <OMBVAR><OMV name="X"/></OMBVAR> 
    <OMA><OMS cd="arith1" name="plus"/><OMV name="X"/><OMV name="X"/></OMA> 
  </OMBIND></OMOBJ></FMP> 
</definition> 

The CMP (commented mathematical property) element gives an informal 
characterization of the definition (which is a simple definition for the symbol with the 
identifier double.sym, according to the attributes of the definition element), 
and the FMP (formal MP) gives the defining $\lambda$-term $\lambda X.(+\,X\,X)$ in OpenMath 
representation. Note that the semantics of such a term is determined 
by that of the symbols $\lambda$ and $+$. These are specified in the MBase 
theories given in the cd attributes of the OMS elements (the name of the symbol 
together with the theory establishes a unique reference in MBase). 
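To make the structure of such a definition concrete, the following sketch parses the example with Python's standard `xml.etree.ElementTree` and prints the formula as a prefix term in which every symbol is qualified by its theory (the cd attribute). The function name `render` and the output notation are assumptions of this sketch and are not part of OMDoc or MBase.

```python
import xml.etree.ElementTree as ET

OMDOC = """<definition id="double.def" item="double.sym" type="simple">
  <CMP xml:lang="eng">The doubling function defined by addition</CMP>
  <FMP><OMOBJ><OMBIND>
    <OMS cd="stlc" name="lambda"/>
    <OMBVAR><OMV name="X"/></OMBVAR>
    <OMA><OMS cd="arith1" name="plus"/><OMV name="X"/><OMV name="X"/></OMA>
  </OMBIND></OMOBJ></FMP>
</definition>"""


def render(node):
    """Render an OpenMath element as a prefix term (hypothetical notation)."""
    if node.tag == "OMS":  # symbol: unique only together with its theory
        return f"{node.get('cd')}:{node.get('name')}"
    if node.tag == "OMV":  # variable
        return node.get("name")
    if node.tag == "OMA":  # application: head followed by arguments
        return "(" + " ".join(render(child) for child in node) + ")"
    if node.tag == "OMBIND":  # binder, bound variables, body
        binder, bound_vars, body = list(node)
        names = " ".join(var.get("name") for var in bound_vars)
        return f"({render(binder)} [{names}] {render(body)})"
    if node.tag == "OMOBJ":  # wrapper around a single object
        return render(node[0])
    raise ValueError(f"unexpected element {node.tag}")


root = ET.fromstring(OMDOC)
symbol_id = root.get("item")
formula = render(root.find("FMP/OMOBJ"))
```

Here `formula` reads `(stlc:lambda [X] (arith1:plus X X))`, which makes visible that the meaning of the term rests entirely on the two theory-qualified symbols.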

As a consequence of the XML-based approach it is possible to generate 
other logical formats from OMDoc by specifying simple XSL [Dea99] style 
sheets; in fact the transformation from OMDoc to the input formats of the 
Ωmega [BCF+97] and InKa [HS96] theorem provers is realized this way. It 
should be an easy exercise for most other concrete input formats. Furthermore, 
one can generate customized OMDoc documents from MBase, which can then 
be presented in one of the more standard presentation media (e.g. LaTeX or 
HTML/MathML). 
Generating OMDoc from a reasoning system is also quite simple in practice, 
since OMDoc has a relatively simple structure (fully specified in an XML document 
type definition [Koh00]) that closely follows the term structure of OpenMath 
(using the OMS, OMV, OMA, OMBIND elements to describe formula trees made 
up of symbols, variables, applications and abstractions). 
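As a rough illustration of how little is needed to emit the formula part, the sketch below turns a term tree into OMS, OMV, OMA and OMBIND elements. The tuple encoding of terms and the function `to_openmath` are assumptions of this example; a reasoning system would start from its own term data structure.

```python
import xml.etree.ElementTree as ET


def to_openmath(term):
    """Translate a term tree (nested tuples, hypothetical encoding) into OpenMath elements."""
    kind = term[0]
    if kind == "sym":  # ("sym", theory, name)
        return ET.Element("OMS", {"cd": term[1], "name": term[2]})
    if kind == "var":  # ("var", name)
        return ET.Element("OMV", {"name": term[1]})
    if kind == "app":  # ("app", head, arg1, ..., argn)
        node = ET.Element("OMA")
        node.extend([to_openmath(sub) for sub in term[1:]])
        return node
    if kind == "bind":  # ("bind", binder, [variable names], body)
        _, binder, variables, body = term
        node = ET.Element("OMBIND")
        node.append(to_openmath(binder))
        bound = ET.SubElement(node, "OMBVAR")
        bound.extend([to_openmath(("var", name)) for name in variables])
        node.append(to_openmath(body))
        return node
    raise ValueError(f"unknown term kind {kind!r}")


# The doubling function: lambda X. (plus X X)
double = (
    "bind",
    ("sym", "stlc", "lambda"),
    ["X"],
    ("app", ("sym", "arith1", "plus"), ("var", "X"), ("var", "X")),
)
xml_text = ET.tostring(to_openmath(double), encoding="unicode")
```

Each case of the term structure maps to exactly one element type, so the generator is a direct structural recursion.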

4 Conclusion, Evaluation, and Future Work 

We have described the MBase system, a distributed mathematical knowledge 
base; it can be obtained from http://www.mathweb.org/mbase. This system 
differs from other repositories of mathematical data such as the Isabelle [Isa] 
or PVS [PVS] libraries in that it is an independent system not tied to a particular 
deduction system and offers inference services (matching, type computation, ...). 
The data format is not geared towards a particular application. 

It is currently used by the Ωmega and InKa theorem provers for storing 
and sharing logical theories including theorems, definitions, tactics and meth- 
ods. In particular, the MBase service can be used as an ontology server fixing 
the semantics of mathematical objects used in protocols for deduction system in- 
tegration. Furthermore, the MBase system is used as the basis of an interactive 
personalized mathematics book (IDA [CCS99]). Here, the structure information 
contained in the MBase version of the IDA data can be used to generate in- 
dividualized sub-documents of IDA on the fly. While in the first case study the 
logical formulation of mathematical data is in the center of interest, in the second 
application textual representation plays a much more prominent role. MBase 
supports both formats and even fosters their integration. 

The current implementation uses the very simple file-based gdbm database 
system. This is sufficient for the amount of data currently available in Ωmega, 
InKa and IDA. Furthermore it offers a very flexible, open and portable pro- 
gramming base. A version of MBase that uses Oracle is currently under de- 
velopment. 

Here a comparison to the MDB system [Har97] developed at the University 
of Erlangen is in order. MDB aims at supplying database support for the Mizar 
libraries, and is based on an object-oriented extension of Oracle. Unfortunately, 
the first 13 (of more than 300) articles already need 500 MB of disk space in 
Oracle. Our division of labor, which treats logical formulae in the programming 
language mOZart and relational, text and structural data in a DBMS, pays off 
here. The size of the data base is only one order of magnitude larger than the size 
of the OMDoc encoding, which is comparable in size to the encodings used e.g. 
in Ωmega, Isabelle, or PVS. As an example of the relative sizes of representations 
in MBase we consider the core theory library of Ωmega and the IDA text: 



Relative sizes of representations in MBase (MB) 

System    native          OMDoc    MBase 
Ωmega     0.61 (POST)     1.5      4.2 
IDA       4.2 (LaTeX)     5.0      9.3 

Even when MBase implementations based on industrial-strength relational data 
base systems such as Oracle are available, we believe that the current gdbm-based 
implementation can still serve as a local development knowledge base and 
“proxy” system to ease the load on the central MBase repository servers. Such 
a local system will probably also be better suited to support the operations 
necessary for changing definitions and axiomatizations during the development 
of a theory. 

In the current version, we have not yet treated more advanced structuring 
concepts like theory morphisms, inheritance wrt. signature mappings, etc. that 
have been developed for structuring the knowledge base (see [KF00]). There 
remains much to be done in this direction, and we hope to adopt techniques 
from algebraic specification (see for instance [Hut99]). 
Abstract. The Tramp system transforms the output of several automated 
theorem provers for first order logic with equality into natural 
deduction proofs at the assertion level. Through this interface, other 
systems such as proof presentation systems or interactive deduction systems 
can access proofs originally produced by any system interfaced by 
Tramp merely by adapting the assertion level proofs to their own needs. 



1 Introduction 



Today's theorem proving systems (automatic and interactive ones) have reached 
considerable strength. However, it has become clear that no single system is 
capable of handling all sorts of deduction tasks. Therefore, it is a well-established 
approach to delegate subgoals to other (specialist) systems such as automated 
theorem provers (ATPs). Unfortunately, most ATPs use their own particular 
formalism. These machine-oriented formalisms make the proofs difficult to read. 
Hence, in order to make use of the results of the ATPs, other systems need to 
adapt the output of an ATP into input that they can process further. To minimize 
the transformation effort, it is advisable to use an interface that transforms the 
machine-found proofs of various formalisms into a uniform format. 

Interactive deduction systems and proof presentation systems in particular need 
a uniform format that they can easily transform into a presentation comprehensible 
to humans. Hence, a uniform format suitable for such systems should 
consist of intuitive steps and should be compact. Some approaches transform the 
machine-found proofs into natural deduction (ND) proofs [2,12]. But the resulting 
ND proofs suffer from the problem that they usually consist of a large number 
of low-level steps which are purely syntactic manipulations of logical quantifiers 
and connectives. One way to alleviate these problems is to produce ND proofs 
at the assertion level [8]. The assertion level allows for human-oriented macro-steps 
justified by the application of theorems, lemmas, or definitions, which are 
collectively called assertions. For instance, the assertion level step 

\[
\frac{F \subseteq G \qquad c \in F}{c \in G}\ \mathrm{DEF}_{\subseteq}
\]

derives the conclusion $c \in G$ by an application of the subset definition $\mathrm{DEF}_{\subseteq}$, 
formalized by $\forall S_1.\forall S_2.(S_1 \subseteq S_2 \Leftrightarrow \forall x.(x \in S_1 \Rightarrow x \in S_2))$, from the premises 
$c \in F$ and $F \subseteq G$. A corresponding basic ND proof consists of a whole sequence 
of basic ND steps. Other work indicates that assertion level proofs are well suited 
as a basic representation level for further human-oriented proof presentation [7]. 
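To see what a single assertion step abbreviates, here is one possible expansion of the subset step into basic ND steps. The rule names ($\forall E$, $\Leftrightarrow E$, $\Rightarrow E$) and the order of the steps are assumptions of this sketch; the derived rules actually used by Tramp may lead to a different and shorter expansion.

```latex
\[
\begin{array}{lll}
1. & F \subseteq G & \text{premise} \\
2. & \forall S_1.\forall S_2.(S_1 \subseteq S_2 \Leftrightarrow \forall x.(x \in S_1 \Rightarrow x \in S_2)) & \mathrm{DEF}_{\subseteq} \\
3. & F \subseteq G \Leftrightarrow \forall x.(x \in F \Rightarrow x \in G) & \forall E \text{ (twice) on 2} \\
4. & F \subseteq G \Rightarrow \forall x.(x \in F \Rightarrow x \in G) & \Leftrightarrow E \text{ on 3} \\
5. & \forall x.(x \in F \Rightarrow x \in G) & \Rightarrow E \text{ on 4 and 1} \\
6. & c \in F \Rightarrow c \in G & \forall E \text{ on 5} \\
7. & c \in F & \text{premise} \\
8. & c \in G & \Rightarrow E \text{ on 6 and 7}
\end{array}
\]
```

All of the intermediate lines 3 to 6 only instantiate and take apart the definition; the assertion level hides exactly this bookkeeping.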

In the following we describe the Tramp system which can transform the 
output of several ATPs for first order logic with equality into ND proofs at the 
assertion level. Moreover, we give an example of an assertion level proof produced 
by Tramp and discuss current applications of Tramp and potential extensions. 

2 The Tramp System 

Tramp consists of three parts: (1) For each ATP interfaced by Tramp there is a 
minor transformation process that can transform a problem description (consist- 
ing of a set of first order formulas, the assumptions, and one first order formula, 
the conclusion) into input suitable for this ATP. (2) At the heart is the transfor- 
mation process that can transform a problem description and the corresponding 
output of an ATP into an ND proof at the assertion level. (3) A communication 
shell handles the access of the ATPs by Tramp and the way other systems can 
reach Tramp. 

The transformation processes producing the inputs for the ATPs all work 
in the same manner: compute the clause normal form of the formulas of the 
problem description; then use these clauses to create an input file for an ATP. 
In the following we focus on the transformation of proofs to the assertion level 
and the integration in a networked proof development environment. 

2.1 Proof Transformation at the Assertion Level 

The transformation process takes as input a problem description and the corre- 
sponding output of an ATP. It produces an ND proof at the assertion level. The 
proof is created through three subprocesses, all embedded into one nutshell: 

Structuring and Transforming into Refutation Graphs: The output of 
the ATP is structured by cutting off lemmas from the main proof. The 
resulting proof parts of the (remaining) main problem and the lemmas are 
each transformed into refutation graphs (refutation graphs are ground clause 
graphs representing refutation proofs [5]). 

Transformation at the Assertion Level: Each refutation graph is transformed 
into an ND proof at the assertion level. These proofs are connected via 
the lemmas such that we obtain a single ND proof at the assertion level. 

Optional Expansion of Assertion Steps: If requested by the user, each 
assertion application can be expanded to a sequence of basic ND steps such 
that the resulting proof is a basic ND proof. 

We enriched this basic transformation procedure with several heuristics, pro- 
ducing especially short and comprehensible proof parts and avoiding indirect 
parts. By structuring and transforming the output of the ATPs into refutation 
graphs we obtain an intermediate uniform representation of the different formalisms 
of the ATPs. To integrate further ATPs we require only appropriate extensions 
of this subprocess. We prefer refutation graphs as intermediate uniform 
representation because proofs in many other refutation based formalisms can 
be transformed easily into refutation graphs (e.g., a transformation algorithm 
for resolution proofs is described in [5]). Furthermore, the correspondences be- 
tween the input clauses (which literals are contradictory?) are directly visible. 
The second subprocess consists of two phases: First, Tramp decomposes the 
assumptions, the conclusion, and the refutation graphs until refutation graphs 
are reached consisting only of a sequence of steps which represent translatable 
assertion applications (Tramp identifies assertion applications already in the 
refutation graphs). Then, in a second phase these refutation graphs are trans- 
formed by translating these steps successively into corresponding assertion ap- 
plications in the ND proof. A detailed description of this algorithm which is an 
extension of [9] to refutation graphs with equality can be found in [11]. 



2.2 Integration in a Networked Proof Development Environment 

All transformation processes are implemented in Allegro Common Lisp and run 
in one Lisp process. This Lisp process runs within a MathWeb communication 
shell [6]. MathWeb is a system for distributed automated theorem proving. Existing 
tools are equipped with a communication shell and are integrated into 
a networked proof development environment. Via MathWeb, Tramp can be 
reached by other MathWeb services and can reach the ATPs which are also 
available as MathWeb services. 

As input Tramp accepts problem descriptions in POST syntax [1]. Moreover, 
the user can feed Tramp directly with the corresponding output of an 
ATP or can instruct Tramp to access several ATPs to prove the input problem. 
Currently, Tramp is able to produce the input and process the output of the 
ATPs Spass, Bliksem, Otter, WaldMeister¹, ProTeIn, and EQP (see [14] 
for references). When instructed to access ATPs, Tramp computes the inputs for 
the chosen ATPs and distributes these inputs via MathWeb among the ATPs. 
The distributed ATPs run competitively. When an ATP is finished, it sends its 
output via MathWeb back to Tramp, which transforms the output into an ND 
proof at the assertion level. Thus, when instructed to access an ATP, Tramp 
behaves for an external user like an ND-ATP. 
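The competitive use of several provers can be pictured with a small sketch. It does not use MathWeb or any real prover: the "provers" are simulated by functions that sleep for a while, and the names `make_prover` and `first_result` are hypothetical. The sketch only shows the pattern of starting all provers on the same problem and continuing with whichever output arrives first.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def make_prover(name, seconds):
    """Simulate a prover that needs the given time for any problem."""
    def run(problem):
        time.sleep(seconds)
        return name, f"proof of {problem} found by {name}"
    return run


def first_result(problem, provers):
    """Run all provers concurrently and return the first finished output."""
    pool = ThreadPoolExecutor(max_workers=len(provers))
    futures = [pool.submit(prover, problem) for prover in provers]
    done, _pending = wait(futures, return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not block on the slower provers
    return next(iter(done)).result()


winner, proof = first_result(
    "SET001", [make_prover("slow", 0.3), make_prover("fast", 0.01)]
)
```

In a setting like the one described above, the returned output would then be handed to the proof transformation, so that the caller never needs to know which prover succeeded.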

As output Tramp again produces POST syntax. A created ND proof is 
expressed in the linearized version first used in [2]. An ND line consists 
of a finite set of formulas $\mathcal{A}$, called the hypotheses, a single formula $F$, called the 
conclusion, and a justification $\mathcal{R}$. Such a line is denoted as $L.\ \mathcal{A} \vdash F\ (\mathcal{R})$, where 
$L$ is a label for this line. Our set of basic ND rules is based on Gentzen's natural 
deduction calculus NK, but is enriched with further derived rules to obtain 
¹ Except for WaldMeister, all ATPs interfaced by Tramp are refutation based. However, 
Tramp can transform the output of WaldMeister into refutation graphs by 
deriving a contradiction between the proved theorem $t = t'$ and its negation $t \neq t'$. 
more comprehensible and more compact proofs. The justification 'application 
of assertion $L_A$ on premises $L_{P_1}, \ldots, L_{P_n}$' is written as $(L_A\ L_{P_1} \ldots L_{P_n})$. 

3 An Example 

We apply Tramp to the problem SET001 of the TPTP problem library [13]. 
The input for Tramp is the following problem description: 

Assumptions: $\forall S_1.\forall S_2.((S_1 = S_2) \Leftrightarrow ((S_1 \subseteq S_2) \wedge (S_2 \subseteq S_1)))$ 

             $\forall S_1.\forall S_2.\forall x.((x \in S_1) \wedge (S_1 \subseteq S_2) \Rightarrow (x \in S_2))$ 

Conclusion:  $\forall S_1.\forall S_2.\forall x.((x \in S_1) \wedge (S_1 = S_2) \Rightarrow (x \in S_2))$ 

Tramp computes the input for an ATP and applies the ATP via MathWeb. 
Then, it transforms its output into the following refutation graph G: 

[Figure: refutation graph G] 

Here $sk_1$, $sk_2$, $sk_3$ are Skolem constants. Afterwards, Tramp transforms G together 
with the input problem description into the following assertion level ND proof: 
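\[
\begin{array}{llll}
L_1. & L_1 & \vdash \forall S_1.\forall S_2.((S_1 = S_2) \Leftrightarrow ((S_1 \subseteq S_2) \wedge (S_2 \subseteq S_1))) & (\mathrm{Hyp}) \\
L_2. & L_2 & \vdash \forall S_1.\forall S_2.\forall x.((x \in S_1) \wedge (S_1 \subseteq S_2) \Rightarrow (x \in S_2)) & (\mathrm{Hyp}) \\
L_3. & L_3 & \vdash (c \in F) \wedge (F = G) & (\mathrm{Hyp}) \\
L_4. & L_3 & \vdash c \in F & (\wedge E\ L_3) \\
L_5. & L_3 & \vdash F = G & (\wedge E\ L_3) \\
L_6. & L_1, L_3 & \vdash F \subseteq G & (L_1\ L_5) \\
L_7. & L_1, L_2, L_3 & \vdash c \in G & (L_2\ L_6\ L_4) \\
L_8. & L_1, L_2 & \vdash (c \in F) \wedge (F = G) \Rightarrow (c \in G) & (\Rightarrow I\ L_3\ L_7) \\
L_9. & L_1, L_2 & \vdash \forall x.((x \in F) \wedge (F = G) \Rightarrow (x \in G)) & (\forall I\ L_8) \\
L_{10}. & L_1, L_2 & \vdash \forall S_2.\forall x.((x \in F) \wedge (F = S_2) \Rightarrow (x \in S_2)) & (\forall I\ L_9) \\
L_{11}. & L_1, L_2 & \vdash \forall S_1.\forall S_2.\forall x.((x \in S_1) \wedge (S_1 = S_2) \Rightarrow (x \in S_2)) & (\forall I\ L_{10})
\end{array}
\]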



In this proof the lines $L_6$ and $L_7$ are justified by assertion applications. 
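The hypothesis column of such a linearized proof can be recomputed from the justifications alone, which gives a simple sanity check. The sketch below encodes the eleven lines of the example; the rule names, the dictionary encoding, and the premise order in the justification of the seventh line are assumptions of this sketch, read off the printed proof as far as it is legible.

```python
# label -> (rule, referenced lines); "assertion" marks assertion applications,
# where the first referenced line is the applied assertion itself.
PROOF = {
    "L1": ("Hyp", []),
    "L2": ("Hyp", []),
    "L3": ("Hyp", []),
    "L4": ("andE", ["L3"]),
    "L5": ("andE", ["L3"]),
    "L6": ("assertion", ["L1", "L5"]),
    "L7": ("assertion", ["L2", "L6", "L4"]),
    "L8": ("impI", ["L3", "L7"]),
    "L9": ("allI", ["L8"]),
    "L10": ("allI", ["L9"]),
    "L11": ("allI", ["L10"]),
}


def hypotheses(proof):
    """Compute the set of hypotheses each line depends on."""
    hyps = {}
    for label, (rule, lines) in proof.items():
        if rule == "Hyp":
            hyps[label] = {label}  # a hypothesis depends on itself
        elif rule == "impI":
            discharged, conclusion = lines  # implication introduction discharges a hypothesis
            hyps[label] = hyps[conclusion] - {discharged}
        else:
            hyps[label] = set().union(*(hyps[line] for line in lines))
    return hyps


hyps = hypotheses(PROOF)
assertion_lines = [label for label, (rule, _) in PROOF.items() if rule == "assertion"]
```

Under these assumptions the final line depends only on the two assumptions of the problem description, and exactly two lines are assertion applications.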



4 Experience, Discussion, and Future Work 

We have tested the implementation of Tramp on about 100 examples from the 
TPTP problem library which can be proved by the ATPs interfaced by Tramp. 
Furthermore, Tramp is in permanent use by the systems ProVerb [10] and 
Ωmega [3]. ProVerb, a proof presentation system, uses Tramp to obtain assertion 
level proofs that it can translate into natural language proofs. Ωmega, 
an interactive mathematical assistant system, calls the ATPs via Tramp on 
open goals in its proof object. Then Ωmega integrates the proofs provided by 
Tramp into its own proof. 

Our experiments show that, for a problem containing some applicable as- 
sertions, the length of the assertion proof is typically about half the length of 
the basic ND proof that results from expanding the abstract assertion steps 
(e.g., the assertion proof in Sec. 3 consists of 11 lines whereas the corresponding 
basic ND proof consists of 21 lines). Although pure equality proofs such as those 
produced by EQP and WaldMeister are neither shortened nor abstracted by using 
Tramp, we use these systems to push the solvability horizon of the ATPs interfaced 
by Tramp. Not unexpectedly, transforming proofs to the assertion level requires 
considerable computational effort for larger proofs. In our experience, this effort 
is justified for systems that need a human-oriented representation of the 
machine-found proofs, such as interactive deduction systems and proof presentation 
systems. Such systems can easily adapt the resulting assertion level proofs 
to their own needs for the following reasons: (1) The resulting assertion level 
proofs are significantly more compact than basic ND proofs. (2) They contain 
meaningful steps and hardly any indirect parts. (3) Each assertion application can 
be expanded to a sequence of basic ND steps when a more detailed derivation 
is needed. For the communication between fully automatic systems, by contrast, 
other uniform representations suitable for this task can be produced with less 
effort (e.g., by using refutation graphs directly). 

We are currently working on an extension of Tramp to handle proofs found 
by LEO [4], a higher order resolution prover with built-in extensionality. 

The current version of Tramp is available at 
http://www.ags.uni-sb.de/~ameier/tramp.html. A web interface for Tramp 
will also be available soon. 
Abstract. We give a resolution-based procedure for deciding unifiability 
in the variety of bounded distributive lattices. The main idea is to use 
a structure-preserving translation to clause form to reduce the problem 
of testing the satisfiability of a unification problem $S$ to the problem 
of checking the satisfiability of a set $\Phi_S$ of (constrained) clauses. These 
ideas can be used for unification with free constants and for unification 
with linear constant restrictions. Complexity issues are also addressed. 



1 Introduction 

From an algebraic point of view, unification can be seen as solving (systems of) 
equations in the initial or free algebra of an equational theory. Apart from its the- 
oretical interest, unification is used e.g. in resolution-based theorem proving and 
in term rewriting to deal with certain equational axioms (such as associativity 
and commutativity). The unification problem has been thoroughly studied for 
equationally defined theories characterized by axioms such as associativity, commutativity, 
distributivity, associativity-commutativity, associativity-commutativity-idempotency; 
and for several theories related to algebra (Abelian groups, 
commutative and Boolean rings, semilattices, Boolean algebras, primal algebras, 
discriminator varieties). For details cf. [5] and the bibliography cited there. The 
combination of unification algorithms has been studied in [4]. 

In this paper we present some results on unification in the equational theory 
of bounded distributive lattices. The study was motivated, on the one hand, by 
our interest in distributive lattices with operators, and, on the other hand, by 
the fact that unification problems in semilattice- and lattice-based structures are 
becoming of increasing interest in computer science (we mention e.g. the results 
of Baader and Narendran on unification of concept terms in description logics 
[2]; similar possible applications in set constraints may also be of interest). 

It is known that the class $D_{01}$ of bounded distributive lattices has an undecidable 
first-order theory, but both its universal theory and its positive $\forall\exists$ theory 
(hence the unification problem with free constants) are decidable. Unification for 
distributive lattices has only been addressed in a few papers. In [12], Gerhard 
and Petrich give a criterion for unifiability (with free constants) of two terms in 
the theory of distributive lattices. (We were not able to generalize the argument 
used in the proof of this result to handle conjunctions of equations.) Then, in 
an attempt to give a basis set for all unifiers of two terms, they considered 
terms containing only one of the lattice operations $\vee$ or $\wedge$ and, for more general 
terms, only particular cases, containing few variables. The results of Ghilardi [13] 
show that the equational class $D$ of distributive lattices has unification type 0, 
i.e. there exist $D$-unification problems with no minimal complete set of unifiers. 
We are not aware of any other results on unification for distributive lattices, 
e.g. concerning its complexity. Due to the interaction between operators, neither 
the ideas used in [19] for distributive unification, nor the results in [4] on the 
combination of unification algorithms can be applied in this case. 

In [20], we gave a resolution-based decision procedure for the universal theory 
of certain varieties of distributive lattices with operators. The arguments in [20] 
cannot be used for the positive $\forall\exists$ theory of such varieties without modification. 
In this paper we further develop the ideas in [20] and show that the use of 
the Priestley representation for bounded distributive lattices allows us to give 
an algorithm based on resolution (with constrained clauses) for the unification 
problem in $D_{01}$. The algorithm consists of the following steps: 

1. Structure-preserving translation to clause form: testing the satisfiability of a 
unification problem $S$ is reduced to the problem of checking the satisfiability 
of a set $\Phi_S$ of clauses. Expressing $\Phi_S$ as a set of constrained clauses further 
simplifies the representation of the problem. 

2. Ordered resolution with selection for (constrained) clauses is used for testing 
the satisfiability of $\Phi_S$. 

We also show that similar ideas can be used for unification with linear constant 
restrictions [4]. These results complete and improve the results in [12]. 

The main advantage of our approach is that the structure-preserving translation 
to clause form makes it much easier to treat the unification problem for bounded 
distributive lattices, by using results in resolution theory. As a byproduct, using 
Prop. 5.6 in [3], our results show that resolution (for ground clauses without 
equality) can be used for deciding the positive theory of $D_{01}$. 

It seems that many of the results in this paper can be extended without 
difficulties to other varieties in which the free algebras have a description similar 
to those in $D_{01}$. This is the case for many subvarieties of the variety of Ockham 
algebras (bounded distributive lattices with a lattice antimorphism), such as, 
e.g., the variety of De Morgan algebras. For the sake of simplicity, in this paper 
we restrict our attention to the class of bounded distributive lattices only. 

The paper is structured as follows. Section 2 contains the background informa- 
tion needed in the paper. Section 3 contains generalities about the unification 
problem for bounded distributive lattices. In Section 4 we give a resolution-based 
algorithm for this problem, and an extension to unification with linear constant 
restrictions. Section 5 contains conclusions and plans for future work. 
2 Preliminaries 

2.1 Algebra 

Let $\Sigma$ be a signature and $a : \Sigma \to \mathbb{N}$ an arity function. A $\Sigma$-algebra is a 
structure $\mathbf{A} = (A, \{\sigma_A\}_{\sigma \in \Sigma})$, where $A$ is a non-empty set and for every $\sigma \in \Sigma$, 
$\sigma_A : A^{a(\sigma)} \to A$. We denote by $T_\Sigma(X)$ the term algebra over $\Sigma$ in the variables 
$X$. An equation is an expression of the form $t_1 = t_2$ where $t_1, t_2 \in T_\Sigma(X)$. A 
$\Sigma$-algebra $\mathbf{A} = (A, \{\sigma_A\}_{\sigma \in \Sigma})$ satisfies an equation $t_1 = t_2$ if $t_1$ and $t_2$ become equal 
for every substitution of elements in $A$ for the variables. An equational class is 
the class of all algebras that satisfy a set of equations. If $E$ is a set of equations 
in the signature $\Sigma$, then $F_E(X) := T_\Sigma(X)/{\equiv_E}$ is the free algebra over $X$ in the 
equational class of all algebras that satisfy $E$ (where $\equiv_E$ is the $\Sigma$-congruence 
on $T_\Sigma(X)$ generated by $E$). A system of equations is a finite set of equations 
$S : \{s_1 = t_1, \ldots, s_k = t_k\}$, where $s_i, t_i \in T_\Sigma(X)$ for every $1 \le i \le k$. Let 
$\{y_1, \ldots, y_n\} \subseteq X$ be the set of all variables in $S$. An algebra $\mathbf{A} = (A, \{\sigma_A\}_{\sigma \in \Sigma})$ 
satisfies the existential closure, $\exists y_1, \ldots, y_n (s_1 = t_1 \wedge \cdots \wedge s_k = t_k)$, of $S$ if there 
exists a map $h : X \to A$ such that $\bar{h}(s_i) = \bar{h}(t_i)$ for every $1 \le i \le k$, where 
$\bar{h} : T_\Sigma(X) \to A$ is the unique homomorphism of $\Sigma$-algebras that extends $h$. 
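For a finite algebra, satisfaction of the existential closure of a system can be checked by brute force: try every map from the variables to the carrier and evaluate both sides of each equation. The sketch below does this for the two-element bounded lattice. The term encoding (strings for variables, tuples for applications) and the names `evaluate` and `find_witness` are assumptions of this example; note that this checks satisfaction in one given algebra and is not the unification procedure of the paper.

```python
from itertools import product

# Operations of the two-element bounded lattice on {0, 1}.
OPS = {"join": max, "meet": min, "0": lambda: 0, "1": lambda: 1}
CARRIER = (0, 1)


def variables_of(term):
    """Collect the variables (plain strings) occurring in a term."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables_of(arg) for arg in term[1:]))


def evaluate(term, ops, h):
    """Homomorphic extension of the assignment h to terms."""
    if isinstance(term, str):
        return h[term]
    symbol, *args = term
    return ops[symbol](*(evaluate(arg, ops, h) for arg in args))


def find_witness(carrier, ops, system):
    """Return an assignment satisfying all equations, or None if there is none."""
    variables = sorted(
        set().union(*(variables_of(s) | variables_of(t) for s, t in system))
    )
    for values in product(carrier, repeat=len(variables)):
        h = dict(zip(variables, values))
        if all(evaluate(s, ops, h) == evaluate(t, ops, h) for s, t in system):
            return h
    return None


# x join y = 1 and x meet y = 0: satisfied by x = 0, y = 1.
solvable = [(("join", "x", "y"), ("1",)), (("meet", "x", "y"), ("0",))]
# x join 0 = 1 and x meet 1 = 0: forces x = 1 and x = 0 at once.
unsolvable = [(("join", "x", ("0",)), ("1",)), (("meet", "x", ("1",)), ("0",))]

witness = find_witness(CARRIER, OPS, solvable)
no_witness = find_witness(CARRIER, OPS, unsolvable)
```

The function `evaluate` plays the role of the unique homomorphism extending the assignment, so a non-`None` result is exactly a witness for the existential closure in this algebra.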

2.2 Lattice Theory 

For the definition of partially-ordered set and order-filter we refer to [10]. If 
$X = (X, \le)$ is a partially-ordered set, we denote its set of order-filters by $\mathcal{O}(X)$. 
There is a bijective correspondence between $\mathcal{O}(X)$ and the set of all 
order-preserving maps from $X$ to the partially-ordered set $2 = (\{0,1\}, \le)$, where 
$0 < 1$. A structure $L = (L, \vee, \wedge)$, where $L$ is a non-empty set and $\vee$ and $\wedge$ are 
two binary operations on $L$, is a lattice if $\vee$ and $\wedge$ are associative, commutative 
and idempotent and satisfy the absorption laws. A distributive lattice is a lattice 
that satisfies either of the distributive laws. A lattice $L = (L, \vee, \wedge)$ has a first 
element if there is an element $0 \in L$ such that $0 \le x$ for every $x \in L$; it 
has a last element if there is an element $1 \in L$ such that $x \le 1$ for every 
$x \in L$ (where $x \le y$ iff $x \vee y = y$). A lattice having both a first and a last 
element is called bounded. In what follows, when we refer to bounded distributive 
lattices, the first and last element are supposed to be included in the signature. 
Thus, a bounded distributive lattice is a structure $L = (L, \vee, \wedge, 0, 1)$, where 
$(L, \vee, \wedge)$ is a distributive lattice and 0, 1 are constants such that 0 is the first element 
and 1 the last element in $(L, \vee, \wedge)$. We denote the equational class of all bounded 
distributive lattices by $D_{01}$. $D_{01}$ contains e.g. the two-element bounded lattice, 
$2 = (\{0, 1\}, \vee, \wedge, 0, 1)$, where $0 \vee 1 = 1$, $0 \wedge 1 = 0$. 
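For finite structures, the axioms listed above can be verified exhaustively. The following sketch checks them for the two-element bounded lattice (join as `max`, meet as `min`) and, as a second example chosen for this illustration, for the divisors of 12 ordered by divisibility (join as least common multiple, meet as greatest common divisor). The function name is hypothetical.

```python
from itertools import product
from math import gcd


def is_bounded_distributive_lattice(elems, join, meet, bottom, top):
    """Exhaustively check the lattice, distributivity and bound axioms."""
    pairs = list(product(elems, repeat=2))
    triples = list(product(elems, repeat=3))
    idempotent = all(join(x, x) == x and meet(x, x) == x for x in elems)
    commutative = all(
        join(x, y) == join(y, x) and meet(x, y) == meet(y, x) for x, y in pairs
    )
    associative = all(
        join(x, join(y, z)) == join(join(x, y), z)
        and meet(x, meet(y, z)) == meet(meet(x, y), z)
        for x, y, z in triples
    )
    absorption = all(
        join(x, meet(x, y)) == x and meet(x, join(x, y)) == x for x, y in pairs
    )
    distributive = all(
        meet(x, join(y, z)) == join(meet(x, y), meet(x, z)) for x, y, z in triples
    )
    # x <= y iff join(x, y) == y, so these are 0 <= x and x <= 1.
    bounded = all(join(bottom, x) == x and join(x, top) == top for x in elems)
    return all([idempotent, commutative, associative, absorption, distributive, bounded])


def lcm(x, y):
    return x * y // gcd(x, y)


two_ok = is_bounded_distributive_lattice((0, 1), max, min, 0, 1)
divisors_ok = is_bounded_distributive_lattice((1, 2, 3, 4, 6, 12), lcm, gcd, 1, 12)
# Swapping join and meet while keeping the bounds violates the bound axioms.
swapped_ok = is_bounded_distributive_lattice((0, 1), min, max, 0, 1)
```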

2.3 Priestley Representation 

If $L$ is a bounded distributive lattice, let $D(L) := \mathrm{Hom}_{D_{01}}(L, 2)$ be the set of all 
0,1-lattice homomorphisms from $L$ to the two-element bounded distributive lattice. 
The space $D(L) = (D(L), \le, \tau)$, where $\le$ is the pointwise ordering on maps 
and $\tau$ is the topology generated by all sets of the form $X_a = \{h \in D(L) \mid h(a) = 1\}$ 
and their complements as a subbasis, is called the Priestley dual 
of $L$. Let $\mathrm{Hom}_P(D(L), 2)$ be the lattice of all continuous and order-preserving 
maps between the ordered topological space $D(L)$ and the two-element 
partially-ordered set 2 with the discrete topology. Priestley [18] showed that for every 
$L \in D_{01}$, $L$ is isomorphic to $\mathrm{Hom}_P(D(L), 2)$. In particular, if $L$ is finite, then $\tau$ 
is the discrete topology, so $L$ is isomorphic to $(\mathcal{O}(D(L)), \cup, \cap, \emptyset, D(L))$. 

The dual of a finite distributive lattice is much smaller and less complex than 
the lattice itself. Therefore, problems concerning finite distributive lattices are 
likely to become simpler when translated into problems about their duals. We 
illustrate this by comparing the free algebra in $D_{01}$ over a finite set $C$, $F_{D_{01}}(C)$, 
and its Priestley dual $D(F_{D_{01}}(C))$. The theorem below is well-known. 

Theorem 1. Let $C$ be a finite set. The following statements hold: 

(1) The map $\varphi_C : (D(F_{D_{01}}(C)), \le) \to (2^C, \le)$ defined for every $h \in D(F_{D_{01}}(C))$ 
by $\varphi_C(h) = h|_C$ (the restriction of $h : F_{D_{01}}(C) \to 2$ to $C$) is an 
order-isomorphism, where in both cases the order is defined pointwise. 

(2) The map $\rho_C : F_{D_{01}}(C) \to \mathcal{O}(2^C, \le)$ defined for every $t \in F_{D_{01}}(C)$ by 
$\rho_C(t) = \{f : C \to \{0, 1\} \mid \bar{f}(t) = 1\}$ (where for every $f : C \to \{0, 1\}$, $\bar{f} : F_{D_{01}}(C) \to 2$ 
is the unique extension of $f$ to a 0,1-lattice homomorphism) is a lattice 
isomorphism. Its inverse is defined for every $U \in \mathcal{O}(2^C, \le)$ by 
$\rho_C^{-1}(U) = \bigvee_{f \in U} \bigl(\bigwedge_{\{c \mid f(c) = 1\}} c\bigr)$. 

Every member of $F_{D_{01}}(C)$ can be written as a finite join of finite meets of elements
in C. Hence, $F_{D_{01}}(C)$ is finite, and its number of elements is bounded by $2^{2^{|C|}}$.
$|F_{D_{01}}(C)|$ has been computed only for small values of $|C|$. By Theorem 1(1),
$D(F_{D_{01}}(C))$ is order-isomorphic to $(\mathcal{P}(C), \subseteq)$, hence has $2^{|C|}$ elements.
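
The size gap between the free lattice and its dual can be checked directly for very small C. The following Python sketch is an illustration only: it identifies maps $C \to \{0, 1\}$ with subsets of C and enumerates, by brute force, the order-filters of $(\mathcal{P}(C), \subseteq)$, which by Theorem 1(2) correspond to the elements of $F_{D_{01}}(C)$. The function names `powerset` and `order_filters` are our own and do not come from the paper.

```python
from itertools import combinations

def powerset(consts):
    """All subsets of consts, as frozensets (the points of the dual)."""
    items = sorted(consts)
    return [frozenset(s) for r in range(len(items) + 1)
            for s in combinations(items, r)]

def order_filters(consts):
    """All upward-closed families of subsets of consts.

    Brute force: every one of the 2^(2^|C|) families is inspected,
    so this is only feasible for |C| <= 3 or so.
    """
    points = powerset(consts)
    filters = []
    for mask in range(1 << len(points)):
        family = {p for i, p in enumerate(points) if mask >> i & 1}
        # A family is an order-filter if it contains every superset
        # of each of its members.
        if all(q in family for p in family for q in points if p <= q):
            filters.append(frozenset(family))
    return filters

sizes = [len(order_filters(set(range(n)))) for n in range(4)]
```

For |C| = 0, 1, 2, 3 the dual has 1, 2, 4, 8 points, while `sizes` contains 2, 3, 6 and 20, each within the bound $2^{2^{|C|}}$.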

The main idea of this paper relies on this remark. The relatively simple
structure of $D(F_{D_{01}}(C))$ allows us to define a more efficient method for checking
the satisfiability of unification problems with constants compared with methods
that use the structure of $F_{D_{01}}(C)$ and/or equational reasoning.

2.4 Unification 

We present the definitions and results on E-unification needed in the paper. 

Definition 1. Let E be an equational theory, $\Sigma$ its signature, and $\Delta$ a signature
containing $\Sigma$. Let $S : \{s_1 = t_1, \ldots, s_k = t_k\}$ be a system of equations, where
$s_i, t_i \in T_\Delta(Y)$. Then S defines an E-unification problem over $\Delta$. S is elementary
iff $\Delta \subseteq \Sigma$; S is an E-unification problem with (free) constants iff $\Delta \setminus \Sigma$ is a set
of constant symbols; and S is an E-unification problem with linear constant
restrictions iff it is an E-unification problem with constants and, in addition, a
linear ordering < on the variables and free constants occurring in S is given. In
a general E-unification problem $\Delta \setminus \Sigma$ may contain arbitrary function symbols.

Definition 2. A unification problem S has a solution w.r.t. E if there is a
substitution $\sigma : Y \to T_\Delta(Y)$ such that $\sigma(s_i) =_E \sigma(t_i)$ for every $1 \leq i \leq k$.

If S is an E-unification problem with linear constant restrictions, a solution
for S is a substitution $\sigma : Y \to T_\Delta(Y)$ with the additional property that for every
variable $y \in Y$ and every constant c, if y < c then c does not occur in $\sigma(y)$.

In this context, one can study decidability of unifiability, the existence of unifiers, 
their classification according to “generality”, or the possibility of determining 
minimal sets of unifiers which are complete, in the sense that all other unifiers 
are less general. In this paper we focus on testing unifiability. This is sufficient in 
many applications (e.g. in constraint-based approaches to automated deduction 
[9,17,16]) and is often simpler than computing complete sets of unifiers. 

Theorem 2. For any E-unification problem $S : \{s_1 = t_1, \ldots, s_k = t_k\}$ with free
constants in C and variables $Y = \{y_1, \ldots, y_n\}$ the following are equivalent:

(1) S has a solution w.r.t. E.

(2) The formula $\exists y_1, \ldots, y_n (s_1 = t_1 \wedge \cdots \wedge s_k = t_k)$ is true in $F_E^{\Sigma \cup C}(\emptyset)$.

(3) There exists $h : Y \to F_E^{\Sigma}(C)$ such that $\bar{h}(s_i) = \bar{h}(t_i)$ for every $1 \leq i \leq k$,
where $\bar{h} : T_\Sigma(Y \cup C) \to F_E^{\Sigma}(C)$ is the unique extension of h to a homomor-
phism, such that, for all $c \in C$, $\bar{h}(c) = [c]$ (where [c] is the equivalence class
of c in $F_E^{\Sigma}(C)$).



Proof: (Sketch) The equivalence of (1) and (2) is proved e.g. in [7]. The equiv- 
alence of (2) and (3) follows from the fact that U{E^yjQ{Y)) is isomorphic to 
E^{Y U C), where U maps a if U C-algebra to a 27-algebra by forgetting the 
constants in the signature, i.e. U{{A, {aA}aGSuc)) = {A, {aA}aGs)- n 

The importance of if-unification with linear constant restrictions is justified 
by the following theorem. 

Theorem 3 ([5,3]). Let E be a non-trivial equational theory. The following 
statements are equivalent: 

(1) The positive theory$^1$ of E is decidable.

(2) General E-unification is decidable.

(3) E-unification with linear constant restrictions is decidable.

More precisely, as pointed out e.g. in [1], the decision problem for E-unification
with linear constant restrictions can be reduced to the decision problem for
general unification in linear time. The nondeterministic polynomial algorithm
given in [3] can be used to reduce the decision problem for general unification
to the decision problem for E-unification with linear constant restrictions.

$^1$ The positive theory of E is the collection of those closed formulae valid in
the class of all models of E which are (equivalent to a formula) of the form
$(Q_1 x_1) \ldots (Q_m x_m) \bigvee_i (s_{i1} = t_{i1} \wedge \cdots \wedge s_{in_i} = t_{in_i})$, where $Q_1, \ldots, Q_m \in \{\exists, \forall\}$.
3 Unification with Constants in $D_{01}$. General Remarks.

We now study the unification problem with free constants in the equational class
$D_{01}$ of bounded distributive lattices. With a slight abuse of notation, $D_{01}$ also
denotes the equational theory of this class. Let $\Sigma = \{\vee, \wedge, 0, 1\}$ be the signature
of bounded distributive lattices. The following result is a direct consequence of
Theorem 2.

Corollary 1. For any $D_{01}$-unification problem $S : \{s_1 = t_1, \ldots, s_k = t_k\}$, with
free constants in C and variables Y, the following are equivalent:

(1) S has a solution w.r.t. $D_{01}$.

(2) There exists $h : Y \to F_{D_{01}}(C)$ such that $\bar{h}(s_i) = \bar{h}(t_i)$ for every $1 \leq i \leq k$,
where $\bar{h} : T_\Sigma(Y \cup C) \to F_{D_{01}}(C)$ is the unique extension of h to a homomor-
phism, such that $\bar{h}(c) = [c]$ for all $c \in C$.

By Corollary 1 and the fact that $F_{D_{01}}(C)$ is finite for every finite C it follows that
$D_{01}$-unification with free constants is decidable. This problem is co-NP hard: if
S contains only one equation and no variables it reduces to the word problem
for $D_{01}$, which has been shown to be co-NP hard [14].

We first present a straightforward (and rather inefficient) method for testing 
whether a unification problem has a solution. We then show how a simpler case 
(only one equation) is solved in [12]. In Section 4 we give a simpler method,
which allows us to test by resolution whether a unification problem has a solution.

The Straightforward Method. Let $S : \{s_1 = t_1, \ldots, s_k = t_k\}$ be a $D_{01}$-
unification problem with free constants in a finite set C and variables in the finite
set Y. We can check if S has a solution by checking if there is an instantiation of
the variables in S with elements in $F_{D_{01}}(C)$ that satisfies S. There exist at most
$(2^{2^{|C|}})^{|Y|}$ such instantiations. For each instantiation $h : Y \to F_{D_{01}}(C)$, one has
to check if $\bar{h}(s_i) =_{D_{01}} \bar{h}(t_i)$, $1 \leq i \leq k$. There exists an algorithm for disproving
the equivalence of two terms which is nondeterministically polynomial in the
length of the terms [14]. The elements in $F_{D_{01}}(C)$ can be written as disjunctions
of conjunctions of elements in C; the length of such a term is at most $|C| \cdot 2^{|C|}$.
Hence, the length of $\bar{h}(s_i)$ and $\bar{h}(t_i)$, $1 \leq i \leq k$, can at most be $|Y| \cdot |C| \cdot 2^{|C|} + \max(S)$,
where $\max(S)$ is the maximal length of a term occurring in S.

A Special Case. In [12] Gerhard and Petrich present the following criterion for
unifiability for one single equation, i.e. for the unification problem $S : \{s = t\}$.

1. Let s' and t' be the disjunctive normal forms of s resp. t.

2. If neither s' nor t' has a constant term$^2$ then s and t are unifiable.

3. If s' or t' has constant terms, let $h : Y \to F_{D_{01}}(C)$ be defined by h(x) = D for
every $x \in Y$, where D is the disjunction of all constant terms in s' and t'.

4. If $\bar{h}(s) =_{D_{01}} \bar{h}(t)$ then s and t are unifiable, otherwise they are not unifiable.

$^2$ If $s = \bigvee_i \bigwedge_{j \in J_i} s_{ij}$ is in disjunctive normal form, then a constant term of s is any of
the conjunctions $\bigwedge_{j \in J_i} s_{ij}$ in s not containing any variable.







The disjunction D in Step 3 can be determined in polynomial time w.r.t.
length(s') + length(t'). The same holds for the process of replacing every variable
in s and t by D. Both the length of D and the length of the result of replacing all
variables in s, t by D ($\bar{h}(s)$ resp. $\bar{h}(t)$) are polynomial in length(s') + length(t'), but
may be exponential in length(s) + length(t). Hence, the complexity of the criterion
above is given by the complexity of Step 1 (computing the disjunctive normal
forms of s and t) and Step 4 (solving a word problem). The last problem is co-NP
complete [14]; there exists an algorithm for disproving the equivalence of $\bar{h}(s)$
and $\bar{h}(t)$ which is nondeterministically polynomial in length($\bar{h}(s)$) + length($\bar{h}(t)$).
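
The four steps of the criterion can be sketched in Python as follows. This is an illustration under our own assumptions, not the procedure of [12] or [14]: terms are encoded as nested tuples `("and", a, b)` / `("or", a, b)` over strings (variables and constants) and the integers 0 and 1, the disjunctive normal form is computed naively without minimisation, and the word problem of Step 4 is decided semantically by evaluating both sides as order-filters of $(\mathcal{P}(C), \subseteq)$ (cf. Theorem 1(2)). All function names are hypothetical.

```python
from itertools import combinations

def powerset(consts):
    items = sorted(consts)
    return [frozenset(s) for r in range(len(items) + 1)
            for s in combinations(items, r)]

def dnf(term):
    """Naive DNF: a set of conjunctions, each a frozenset of atoms."""
    if term == 0:
        return set()              # empty disjunction
    if term == 1:
        return {frozenset()}      # the empty conjunction
    if isinstance(term, str):
        return {frozenset([term])}
    op, left, right = term
    a, b = dnf(left), dnf(right)
    return a | b if op == "or" else {p | q for p in a for q in b}

def value(conjunctions, var_value, consts, points):
    """Order-filter denoted by a DNF when every variable denotes var_value."""
    result = set()
    for conj in conjunctions:
        members = set(points)
        for atom in conj:
            # A constant c denotes {X | c in X}; a variable denotes var_value.
            members &= ({x for x in points if atom in x}
                        if atom in consts else var_value)
        result |= members
    return result

def unifiable_one_equation(s, t, consts):
    points = powerset(consts)
    s_dnf, t_dnf = dnf(s), dnf(t)                                  # Step 1
    constant_terms = {k for k in s_dnf | t_dnf if k <= set(consts)}
    if not constant_terms:                                          # Step 2
        return True
    d = value(constant_terms, set(), consts, points)               # Step 3
    return (value(s_dnf, d, consts, points)
            == value(t_dnf, d, consts, points))                    # Step 4
```

For instance, with C = {c} the equation $y \vee c = c$ is reported unifiable (take y := c), whereas $y \wedge c = 1$ is not, since every instance of $y \wedge c$ lies below c.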

4 A Resolution-Based Algorithm 

We give a simpler algorithm for the problem of deciding whether a $D_{01}$-unification
problem with free constants has a solution. The algorithm consists of two steps:

1. Structure-preserving translation to clause form:

- Reduce the problem of testing the satisfiability of a unification problem
S to checking the satisfiability of a set $\Phi_S$ of clauses.

- Show that $\Phi_S$ can be expressed as a set of constrained clauses.

2. Checking satisfiability by ordered resolution with selection:

- Use ordered resolution with selection for constrained sets of clauses to
test the satisfiability of $\Phi_S$.



4.1 Structure-Preserving Translation to Clause Form 

In this section we reduce the problem of testing the satisfiability of a unification
problem S to that of checking the satisfiability of a set of clauses. We do this
in two steps: Theorem 4 shows that $F_{D_{01}}(C)$ can be replaced with the lattice
of order-filters of $(\mathcal{P}(C), \subseteq)$; Theorem 5 further reduces the problem to that of
checking the satisfiability of a set of first-order (ground) clauses.

Theorem 4. For any $D_{01}$-unification problem $S : \{s_1 = t_1, \ldots, s_k = t_k\}$ with
free constants C and variables Y, the following are equivalent:

(1) S has a solution w.r.t. $D_{01}$.

(2) There exists $h : Y \to F_{D_{01}}(C)$ such that $\bar{h}(s_i) = \bar{h}(t_i)$ for every $1 \leq i \leq k$,
where $\bar{h} : T_\Sigma(Y \cup C) \to F_{D_{01}}(C)$ is the unique homomorphism that extends
h, such that $\bar{h}(c) = [c]$ for all $c \in C$.

(3) There exists $g : Y \to \mathcal{O}(\mathcal{P}(C), \subseteq)$ such that $\bar{g}(s_i) = \bar{g}(t_i)$ for every
$1 \leq i \leq k$, where $\bar{g} : T_\Sigma(Y \cup C) \to \mathcal{O}(\mathcal{P}(C), \subseteq)$ is the unique homomorphism
that extends g, such that $\bar{g}(c) = \uparrow\{c\} = \{X \subseteq C \mid c \in X\}$ for every $c \in C$.

Proof: (Idea) The equivalence of (1) and (2) follows directly from Corollary 1.
The equivalence of (2) and (3) follows from the fact that there exists a 0,1-lattice
isomorphism $\eta_C : F_{D_{01}}(C) \to \mathcal{O}(\mathcal{P}(C), \subseteq)$ defined for every $t \in F_{D_{01}}(C)$ by
$\eta_C(t) = \{f^{-1}(1) \cap C \mid f : F_{D_{01}}(C) \to 2$ is a 0,1-lattice homomorphism; $f(t) = 1\}$
such that for every $c \in C$, $\eta_C([c]) = \uparrow\{c\}$. □

Theorem 4 justifies a reduction of the problem of checking whether a unification 
problem with constants S has a solution to the problem of checking the satisfi- 
ability of a system of set constraints. This reduction can be then used to give a 
structure-preserving translation to clause form. Thus, the problem of checking 
whether a unification problem with constants S has a solution can be reduced 
to the problem of checking the satisfiability of a set of clauses. 
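
Condition (3) of Theorem 4 can be tested directly, if inefficiently, by trying every assignment of order-filters to the variables. The Python sketch below is such a brute-force check, included only to make the set-constraint reading concrete; it is not the resolution-based method developed in this section. The term encoding (nested tuples `("and", a, b)` / `("or", a, b)`, strings for variables and constants, integers 0 and 1) and all function names are our own assumptions.

```python
from itertools import combinations, product

def powerset(consts):
    items = sorted(consts)
    return [frozenset(s) for r in range(len(items) + 1)
            for s in combinations(items, r)]

def order_filters(consts):
    """All upward-closed families of subsets of consts (brute force)."""
    points = powerset(consts)
    filters = []
    for mask in range(1 << len(points)):
        family = {p for i, p in enumerate(points) if mask >> i & 1}
        if all(q in family for p in family for q in points if p <= q):
            filters.append(frozenset(family))
    return filters

def evaluate(term, g, consts, points):
    """Homomorphic extension of g with c interpreted as {X | c in X}."""
    if term == 0:
        return frozenset()
    if term == 1:
        return frozenset(points)
    if isinstance(term, str):
        if term in consts:
            return frozenset(p for p in points if term in p)
        return g[term]
    op, left, right = term
    a = evaluate(left, g, consts, points)
    b = evaluate(right, g, consts, points)
    # Meet is intersection, join is union in the lattice of order-filters.
    return a & b if op == "and" else a | b

def unifiable(equations, variables, consts):
    points = powerset(consts)
    for choice in product(order_filters(consts), repeat=len(variables)):
        g = dict(zip(variables, choice))
        if all(evaluate(s, g, consts, points) == evaluate(t, g, consts, points)
               for s, t in equations):
            return True
    return False
```

With C = {c}, the problem $\{y \wedge c = 0, y \vee c = 1\}$ has no solution under this check, while $\{y \vee c = c\}$ does.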

The structure-preserving translation to clause form is inspired by Tseitin’s 
well-known method for transforming quantifier-free formulae to clausal normal 
form and by the ideas in [20]. The link with set constraints mentioned above 
also explains the similarities with the structure-preserving translation to clause 
form presented in [6] . The remarks above are formally expressed by the following 
theorem. 

Theorem 5. Let $S : \{s_1 = t_1, \ldots, s_k = t_k\}$ be a $D_{01}$-unification problem with
free constants C, and variables $Y = \{y_1, \ldots, y_n\}$. Let ST(S) be the set of all
subterms of terms occurring in S. The following are equivalent:

(1) There exists $h : \{y_1, \ldots, y_n\} \to \mathcal{O}(\mathcal{P}(C), \subseteq)$ such that $\bar{h}(s_i) = \bar{h}(t_i)$ for every
$1 \leq i \leq k$, where $\bar{h} : T_\Sigma(Y \cup C) \to \mathcal{O}(\mathcal{P}(C), \subseteq)$ is the unique homomorphism
that extends h, such that $\bar{h}(c) = \uparrow\{c\}$ for every $c \in C$.

(2) There exists a family $\{I_e\}_{e \in ST(S)}$ such that $I_e \subseteq \mathcal{P}(C)$ for all $e \in ST(S)$,
and for all $X, X_1, X_2 \subseteq C$ the following hold:

- if $X_1 \in I_y$ and $X_1 \subseteq X_2$ then $X_2 \in I_y$, for every $y \in \{y_1, \ldots, y_n\}$;

- $X \in I_{e_1 \wedge e_2}$ iff $X \in I_{e_1}$ and $X \in I_{e_2}$;

- $X \in I_{e_1 \vee e_2}$ iff $X \in I_{e_1}$ or $X \in I_{e_2}$;

- $I_0 = \emptyset$; $I_1 = \mathcal{P}(C)$; and for every $c \in C$, $X \in I_c$ iff $c \in X$;

- $X \in I_{s_i}$ iff $X \in I_{t_i}$ for all $1 \leq i \leq k$.

(3) The conjunction of the following formulae is satisfiable:

(Her)         $P_y(X_1) \to P_y(X_2)$   for all $X_1 \subseteq X_2 \subseteq C$, $y \in \{y_1, \ldots, y_n\}$
(Ren) ($\wedge$n)   $P_{e_1 \wedge e_2}(X) \to P_{e_i}(X)$   for all $X \subseteq C$, $i = 1, 2$
      ($\wedge$p)   $P_{e_1}(X) \wedge P_{e_2}(X) \to P_{e_1 \wedge e_2}(X)$   for all $X \subseteq C$
      ($\vee$n)   $P_{e_1 \vee e_2}(X) \to P_{e_1}(X) \vee P_{e_2}(X)$   for all $X \subseteq C$
      ($\vee$p)   $P_{e_i}(X) \to P_{e_1 \vee e_2}(X)$   for all $X \subseteq C$, $i = 1, 2$
      (1)     $P_1(X)$   for all $X \subseteq C$
      (0)     $\neg P_0(X)$   for all $X \subseteq C$
      (cp)    $P_c(X)$   for all $X \subseteq C$ with $c \in X$
      (cn)    $\neg P_c(X)$   for all $X \subseteq C$ with $c \notin X$
(P)           $P_{s_i}(X) \leftrightarrow P_{t_i}(X)$   for all $X \subseteq C$, for all $1 \leq i \leq k$

where each formula in (Her) $\cup$ (Ren) $\cup$ (P) is the conjunction of all formulae
obtained by instantiating the variables X, resp. $X_1, X_2$ with subsets of C
satisfying the additional conditions; the indices $e_1 \vee e_2$, $e_1 \wedge e_2$, 0, 1, c range
over all elements in ST(S); y ranges over all variables in $\{y_1, \ldots, y_n\}$.
Proof: (Sketch) (1) $\Rightarrow$ (2). For every $e \in ST(S)$ let $I_e := \bar{h}(e)$. Since $\bar{h}$ is a
0,1-homomorphism with $\bar{h}(c) = \uparrow\{c\}$, and the lattice operations in $\mathcal{O}(\mathcal{P}(C), \subseteq)$
are union and intersection, the family $\{I_e\}_{e \in ST(S)}$ satisfies the conditions in (2).

(2) $\Rightarrow$ (3). Let $\{I_e\}_{e \in ST(S)}$ be a family satisfying the conditions in (2). Then
$(\mathcal{P}(C), I)$, where $I(P_e) := I_e$ for all $e \in ST(S)$, is a model for (Her) $\cup$ (Ren) $\cup$ (P).

(3) $\Rightarrow$ (1). Assume that (Her) $\cup$ (Ren) $\cup$ (P) (a conjunction of ground clauses) is
satisfied by the map $I : \{P_e(X) \mid e \in ST(S), X \subseteq C\} \to \{0, 1\}$. For every $y \in Y$
let $h(y) := \{X \in \mathcal{P}(C) \mid I(P_y(X)) = 1\}$. Let $\bar{h} : T_\Sigma(Y \cup C) \to \mathcal{O}(\mathcal{P}(C), \subseteq)$
be the unique homomorphism that extends h, such that $\bar{h}(c) = \uparrow\{c\}$ for every
$c \in C$. As I satisfies (Her) $\cup$ (Ren), $\bar{h}(e) = \{X \in \mathcal{P}(C) \mid I(P_e(X)) = 1\}$ for all
$e \in ST(S)$. Since I satisfies (P), $\bar{h}(s_i) = \bar{h}(t_i)$ for every $1 \leq i \leq k$. □

Corollary 2. The $D_{01}$-unification problem $S : \{s_1 = t_1, \ldots, s_k = t_k\}$ with free
constants C has a solution w.r.t. the equational theory of $D_{01}$ iff the set of clauses
(Her) $\cup$ (Ren) $\cup$ (P) is satisfiable.

The satisfiability of (Her) U (Ren) U (P) can be checked for instance by resolution. 
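
To make the translation concrete, the following Python sketch generates the ground clauses (Her) $\cup$ (Ren) $\cup$ (P) of Theorem 5(3) and tests their satisfiability. It is an illustration only: the satisfiability test is a plain backtracking search standing in for resolution, the term encoding (nested tuples `("and", a, b)` / `("or", a, b)`, strings for variables and constants, integers 0 and 1) is our own, and a literal is represented as a triple `(sign, e, X)` for $P_e(X)$ or its negation.

```python
from itertools import combinations

def powerset(consts):
    items = sorted(consts)
    return [frozenset(s) for r in range(len(items) + 1)
            for s in combinations(items, r)]

def subterms(term):
    yield term
    if isinstance(term, tuple):
        yield from subterms(term[1])
        yield from subterms(term[2])

def clauses_for(equations, variables, consts):
    """Ground clauses (Her), (Ren), (P); a literal is (sign, e, X)."""
    points = powerset(consts)
    st = {e for s, t in equations for side in (s, t) for e in subterms(side)}
    cls = []
    for y in variables:                                   # (Her)
        cls += [[(False, y, x1), (True, y, x2)]
                for x1 in points for x2 in points if x1 < x2]
    for e in st:                                          # (Ren)
        for x in points:
            if e == 0:
                cls.append([(False, e, x)])
            elif e == 1:
                cls.append([(True, e, x)])
            elif isinstance(e, str):
                if e in consts:                           # (cp) / (cn)
                    cls.append([(e in x, e, x)])
            elif e[0] == "and":
                _, a, b = e
                cls += [[(False, e, x), (True, a, x)],
                        [(False, e, x), (True, b, x)],
                        [(False, a, x), (False, b, x), (True, e, x)]]
            else:
                _, a, b = e
                cls += [[(False, e, x), (True, a, x), (True, b, x)],
                        [(False, a, x), (True, e, x)],
                        [(False, b, x), (True, e, x)]]
    for s, t in equations:                                # (P), both directions
        for x in points:
            cls += [[(False, s, x), (True, t, x)], [(False, t, x), (True, s, x)]]
    return cls

def satisfiable(cls, assignment=None):
    """Backtracking search over truth values of the atoms P_e(X)."""
    assignment = dict(assignment or {})
    remaining = []
    for clause in cls:
        if any(assignment.get((e, x)) == sign for sign, e, x in clause):
            continue                                      # clause already true
        rest = [lit for lit in clause if (lit[1], lit[2]) not in assignment]
        if not rest:
            return False                                  # clause falsified
        remaining.append(rest)
    if not remaining:
        return True
    sign, e, x = min(remaining, key=len)[0]               # prefer short clauses
    return (satisfiable(remaining, {**assignment, (e, x): sign})
            or satisfiable(remaining, {**assignment, (e, x): not sign}))
```

For C = {c}, `clauses_for` applied to $\{y \wedge c = 0, y \vee c = 1\}$ yields an unsatisfiable clause set, in line with Corollary 2.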

We now give an upper bound on the complexity of deciding the satisfiability of 
(Her) U (Ren) U (P), i.e. for deciding the unifiability of S. 

Theorem 6. (1) The problem of deciding whether the $D_{01}$-unification problem
S has a solution can be solved in at most nondeterministically polynomial time
in $|ST(S)|\,2^{|C|}$ (and in exponential time in $|ST(S)|\,2^{|C|}$ by using resolution).

(2) If S only contains the operation symbols $\wedge$, 0, 1, and, possibly, constants,
then the problem can be decided in at most polynomial time in $|ST(S)|\,2^{|C|}$.

Proof: Note first that the structure-preserving translation to clause form in The-
orem 5 is polynomial in $|ST(S)|\,2^{|C|}$. The size of the conjunction of all formulae
in (Her) $\cup$ (Ren) $\cup$ (P) is also polynomial in $|ST(S)|\,2^{|C|}$. (1) follows from this and
the fact that the number of all distinct literals that can occur in the conjunction
of ground clauses (Her) $\cup$ (Ren) $\cup$ (P) in Theorem 5(3) is bounded by $|ST(S)|\,2^{|C|}$.
To prove (2) note that if only the operators $\wedge$, 0, 1 and, possibly, constants, occur
in S, then the clause form of (Her) $\cup$ (Ren) $\cup$ (P) is a set of ground Horn clauses.
Dowling and Gallier [11] showed that satisfiability of a set of ground Horn
clauses can be proved in linear time w.r.t. the number of clauses in the set. □
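
The Horn case can be illustrated with a simple forward-chaining test. The sketch below is a naive fixpoint computation, not the linear-time procedure of [11]; the clause representation (a pair of a body set and a head, with head `None` for a purely negative clause) and the atom names are our own assumptions, with each atom standing for some ground $P_e(X)$.

```python
def horn_satisfiable(clauses):
    """Decide satisfiability of ground Horn clauses by forward chaining.

    clauses: list of (body, head), where body is a set of atoms and head is
    an atom, or None for a clause with no positive literal.
    """
    true_atoms = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if body <= true_atoms:
                if head is None:
                    return False          # a purely negative clause is violated
                if head not in true_atoms:
                    true_atoms.add(head)  # the head is forced to be true
                    changed = True
    return True

# P_c({c}); P_c({c}) -> P_{c&1}({c}); not P_{c&1}({c})
example = [(set(), "Pc"), ({"Pc"}, "Pc_and_1"), ({"Pc_and_1"}, None)]
```

Here `horn_satisfiable(example)` is false: the fact forces the head of the second clause, which contradicts the third.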

Note. It is not necessary to explicitly add to (Her) $\cup$ (Ren) $\cup$ (P) formulae ex-
pressing the order relationship between the elements in $\mathcal{P}(C)$. However, these
relationships have to be known when expressing (Her) $\cup$ (Ren) $\cup$ (P) as the con-
junction of ground formulae by instantiating the variables with elements in $\mathcal{P}(C)$.



4.2 Translation to Constrained Clause Form 

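The clause form of the set of formulae (Her) $\cup$ (Ren) $\cup$ (P) defined in Theorem 5(3)
can be naturally expressed by constrained clauses of a special form as follows:

(Her)         $P_y(X_1) \to P_y(X_2)$   $[X_1 \subseteq X_2 \subseteq C]$, $y \in Y$
(Ren) ($\wedge$n)   $P_{e_1 \wedge e_2}(X) \to P_{e_i}(X)$   $[X \subseteq C]$, $i = 1, 2$
      ($\wedge$p)   $P_{e_1}(X) \wedge P_{e_2}(X) \to P_{e_1 \wedge e_2}(X)$   $[X \subseteq C]$
      ($\vee$n)   $P_{e_1 \vee e_2}(X) \to P_{e_1}(X) \vee P_{e_2}(X)$   $[X \subseteq C]$
      ($\vee$p)   $P_{e_i}(X) \to P_{e_1 \vee e_2}(X)$   $[X \subseteq C]$, $i = 1, 2$
      (1)     $P_1(X)$   $[X \subseteq C]$
      (0)     $\neg P_0(X)$   $[X \subseteq C]$
      (cp)    $P_c(X)$   $[X \subseteq C, c \in X]$
      (cn)    $\neg P_c(X)$   $[X \subseteq C, c \notin X]$
(P)           $P_{s_i}(X) \leftrightarrow P_{t_i}(X)$   $[X \subseteq C]$, for all $1 \leq i \leq k$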



Definition 3. A constrained clause has the form $D\,[\phi]$, where (i) D is a first-
order clause with variables $X, X_1, \ldots, X_n, \ldots$ ranging over a countably infinite
set V; all predicates occurring in D are unary; and (ii) the constraint $\phi$ is of the
form $\bigwedge_{i \in I_1}(c_i \in X_i) \wedge \bigwedge_{i \in I_2}(d_i \notin X_i) \wedge \bigwedge_{i \in I_3, j \in I_4}(X_i \subseteq X_j) \wedge \bigwedge_{i \in I_1 \cup \cdots \cup I_4}(X_i \subseteq C)$.

Let S be a $D_{01}$-unification problem, and $\Phi_S$ the set of constrained clauses as-
sociated with S as explained above. Then $\Phi_S$ can be constructed in polynomial
time in the size of S. The size of $\Phi_S$ is polynomial in the size of S.

A substitution of the variables in V is called ground when it replaces ev-
ery variable by an element of $\mathcal{P}(C)$ (this is the Herbrand universe of (Her) $\cup$
(Ren) $\cup$ (P)). A constrained clause $D\,[\phi]$ represents the set $(D\,[\phi])^g = \{D\sigma \mid
\sigma$ ground; $\phi\sigma$ true$\}$ of all ground instances of D by instantiations of the vari-
ables which satisfy $\phi$. We say that a set $\Phi$ of constrained clauses is satisfiable if
the set of all its ground instances, $\Phi^g = \bigcup_{D[\phi] \in \Phi} (D\,[\phi])^g$, is satisfiable.

4.3 A Resolution Calculus for Constrained Clauses 

We now formulate a resolution calculus $\mathrm{CRes}^{\succ}_{S}$ for the type of constrained clauses
considered here. The calculus is parameterized by a total ordering $\succ$ on the
predicate symbols, and a selection function S that assigns to each constrained
clause $D\,[\phi]$ a (possibly empty) multiset of (occurrences of) negative literals,
called the selected literals of $D\,[\phi]$. $\mathrm{CRes}^{\succ}_{S}$ consists of the following rules$^3$.

Ordered Resolution.

$$\frac{D_1 \vee P_e(X)\ [\phi_1(X, X_1, \ldots, X_n)] \qquad D_2 \vee \neg P_e(Z)\ [\phi_2(Z, Z_1, \ldots, Z_n)]}{D_1 \vee D_2\sigma\ [\phi_1(X, X_1, \ldots, X_n) \wedge \phi_2(Z, Z_1, \ldots, Z_n)\sigma]}$$

where $\sigma(Z) = X$ and $\sigma(W) = W$ for every other variable W; $P_e$ is the largest
predicate symbol in $D_1 \vee P_e(X)$ and no literal is selected in $D_1 \vee P_e(X)$, and either
$\neg P_e(Z)$ is selected in $D_2 \vee \neg P_e(Z)$, or otherwise nothing is selected in $D_2 \vee \neg P_e(Z)$
and $P_e$ is the largest predicate symbol in $D_2 \vee \neg P_e(Z)$.

Positive Factoring.

$$\frac{D \vee P_e(X) \vee P_e(Z)\ [\phi(X, Z, X_1, \ldots, X_n)]}{D\sigma \vee P_e(X)\ [\phi(X, Z, X_1, \ldots, X_n)\sigma]}$$

where $\sigma(Z) = X$ and $\sigma(W) = W$ for every other variable W; $P_e$ is the largest
predicate symbol in $D \vee P_e(X) \vee P_e(Z)$ and nothing is selected in $D \vee P_e(X) \vee P_e(Z)$.

$^3$ The ordered resolution calculus with selection $\mathrm{Res}^{\succ}_{S}$ is complete for any well-founded
and total ordering on ground literals and any selection function S. Here we consider
a less restrictive form of resolution for constrained clauses, in order to simplify the
presentation by avoiding the necessity of also handling order constraints (w.r.t. $\succ$).



Theorem 7. Let $\Phi$ be a set of constrained clauses, $\succ$ a total order on the pred-
icate symbols, and S a selection function. $\Phi$ is unsatisfiable iff the empty clause
$\Box\,[\phi]$ (constrained by a satisfiable constraint $\phi$) is derivable from $\Phi$ in $\mathrm{CRes}^{\succ}_{S}$.

Proof: (Idea) The proof uses the completeness of ordered resolution with selec-
tion for ground clauses and a lifting lemma for constrained clauses; the arguments
are similar to those in [9]. □

Example: Decide whether the $D_{01}$-unification problem $S : \{y \wedge c = 0, y \vee c = 1\}$
has a solution, where c is a constant and y a variable.

(Note that S corresponds to the formula $\forall c \exists y (y \wedge c = 0$ and $y \vee c = 1)$.)

Solution: Let $\succ$ be a total ordering on the predicate symbols, defined such that
$P_e \succ P_{e'}$ whenever (i) e' is a subterm of e; or (ii) e is a non-atomic term and e' is a
constant; or (iii) e is a non-atomic term or a constant and e' is a variable. Let S
be a selection function that selects all negative occurrences of literals except in
Ren($\vee$p, $\wedge$p), where nothing is selected. By the structure-preserving translation
to clause form in Theorem 5(3) we obtain the following set of constrained clauses:

(1)  $P_y(X) \to P_y(Y)$   $[X \subseteq Y \subseteq C]$
(2)  $P_{y \wedge c}(X) \to P_y(X)$   $[X \subseteq C]$
(3)  $P_{y \wedge c}(X) \to P_c(X)$   $[X \subseteq C]$
(4)  $P_y(X) \wedge P_c(X) \to P_{y \wedge c}(X)$   $[X \subseteq C]$
(5)  $P_{y \vee c}(X) \to P_y(X) \vee P_c(X)$   $[X \subseteq C]$
(6)  $P_y(X) \to P_{y \vee c}(X)$   $[X \subseteq C]$
(7)  $P_c(X) \to P_{y \vee c}(X)$   $[X \subseteq C]$
(8)  $\neg P_0(X)$   $[X \subseteq C]$
(9)  $P_1(X)$   $[X \subseteq C]$
(10) $P_c(X)$   $[X \subseteq C, c \in X]$
(11) $\neg P_c(X)$   $[X \subseteq C, c \notin X]$
(12) $P_{y \wedge c}(X) \to P_0(X)$   $[X \subseteq C]$
(13) $P_0(X) \to P_{y \wedge c}(X)$   $[X \subseteq C]$
(14) $P_{y \vee c}(X) \to P_1(X)$   $[X \subseteq C]$
(15) $P_1(X) \to P_{y \vee c}(X)$   $[X \subseteq C]$

where C = {c}. By the choice of S, every negative literal is selected except in
clauses (4), (6) and (7), which belong to Ren($\vee$p, $\wedge$p). All ground inferences of (13)
and (14) are redundant (so, (13) and (14) can be considered to be redundant).
We obtain the following deduction of the empty clause $\Box$:



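(16) $P_y(X) \wedge P_c(X) \to P_0(X)$   $[X \subseteq C]$   (by (12) and (4))
(17) $P_y(X) \to P_0(X)$   $[c \in X, X \subseteq C]$   (by (10) and (16))
(18) $P_{y \vee c}(X)$   $[X \subseteq C]$   (by (15) and (9))
(19) $P_y(X) \vee P_c(X)$   $[X \subseteq C]$   (by (18) and (5))
(20) $P_y(X)$   $[c \notin X, X \subseteq C]$   (by (19) and (11))
(21) $P_y(Y)$   $[c \notin X, X \subseteq Y \subseteq C]$   (by (20) and (1))
(22) $P_0(X)$   $[c \notin X, c \in X, X \subseteq C]$   (by (20) and (17))
(23) $\Box$   $[c \notin X, c \in X, X \subseteq C]$   (by (22) and (8))
(24) $P_0(Y)$   $[c \notin X, c \in Y, X \subseteq Y \subseteq C]$   (by (21) and (17))
(25) $\Box$   $[c \notin X, c \in Y, X \subseteq Y \subseteq C]$   (by (24) and (8))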

The constraint in (23) is unsatisfiable, but the constraint in (25) is satisfiable
(e.g. by $X = \emptyset$ and $Y = C$)$^4$. So, the set consisting of the clauses (1)-(15) is
unsatisfiable, hence S has no solution.



4.4 Complexity Considerations for Some Special Cases 

We now analyze some situations in which deciding $D_{01}$-unifiability is especially
easy. We start by showing that for unification problems of the form $S : \{s = t\}$
the algorithm performs well. We end by analyzing the complexity of unification
without constants.

Unification with Free Constants: General Case. Let S be a unification
problem. Let $\succ$ be a total ordering on the predicate symbols $\{P_e \mid e \in ST(S)\}$,
defined such that $P_e \succ P_{e'}$ whenever (i) e' is a subformula of e; or (ii) e is
a non-atomic formula and e' is a constant; or (iii) e is a non-atomic formula or a
constant and e' is a variable. Let S be a selection function that (i) selects nothing
in Ren($\vee$p, $\wedge$p), and (ii) in every other non-positive clause selects the set of all
occurrences of negative literals that contain the maximal predicate symbol(s)
among those occurring in the negative literals in the clause. Then:

1. no inferences are possible between (Her) and (Ren); 

2. all inferences between two clauses in (Ren) generate tautologies; 

3. no inferences are possible between (Her) and clauses in (P); 

4. inferences between $P_{s_i}(X) \to P_{t_i}(X)$ in (P) and clauses in (Ren) lead to:

(a) $\bigwedge_{j \in J} P_{e_j}(X) \to P_{t_i}(X)\ [X \subseteq C]$, where for every J, $\{e_j \mid j \in J\}$ is a
multiset of subterms of $s_i$, containing no repetition of subterms that
occur at the same position in $s_i$;

(b) $(\bigwedge P_{c_j}(X)) \wedge (\bigwedge P_{x_j}(X)) \to P_{t_i}(X)\ [X \subseteq C, d_i \in X, i \in I]$;

(c) $\bigwedge P_{x_j}(X) \to P_{t_i}(X)\ [X \subseteq C, d_i \in X, i \in I]$;

(d) $P_{t_i}(X)\ [X \subseteq C, d_i \in X, i \in I]$.

There are at most such clauses; all constraints are linear in length($s_i$).

$^4$ This shows that inferences with the clause (Her) in Theorem 5(3) (in particular, with
clause (1) for this example) are necessary for the correctness of the method.

5. inferences between $P_{t_i}(X)\ [X \subseteq C, d_i \in X, i \in I]$ and clauses in (Ren) lead to:

(a') $\bigvee_{j \in J} P_{e'_j}(X)\ [X \subseteq C, d_i \in X, i \in I]$, where for every J, $\{e'_j \mid j \in J\}$ is a
multiset of subterms of $t_i$, containing no repetition of subterms that
occur at the same position in $t_i$;

(b') $(\bigvee P_{c_j}(X)) \vee (\bigvee P_{x_j}(X))\ [X \subseteq C, d_i \in X, i \in I, d'_k \notin X, k \in K]$;

(c') $\bigvee P_{x_j}(X)\ [X \subseteq C, d_i \in X, i \in I, d'_k \notin X, k \in K]$;

(d') $\Box\ [X \subseteq C, d_i \in X, i \in I]$

or factors thereof. There are at most such clauses; all constraints
are linear in length($s_i$) + length($t_i$).

6. inferences between clauses of type (c') and (Her) produce clauses of the form
(e') $\bigvee P_{x_j}(X_j)\ [X \subseteq C, d_i \in X, i \in I, d'_k \notin X, k \in K, X \subseteq X_j]$;

7. further inferences may be possible between clauses of the form (a)-(c) and
clauses of type (d) or (a')-(c') and (e') and resolvents thereof. (These in-
ferences can be further controlled, e.g. by adopting an additional label-
ing of the predicate symbols $P_e$ that also indicates in which of the terms
$t_1, \ldots, t_k, s_1, \ldots, s_k$ and at which position e occurs; and by defining redun-
dancy criteria. Here, we do not go into further details.)

Corollary 3. If all clauses generated from (Her) $\cup$ (Ren) $\cup$ (P) by $\mathrm{CRes}^{\succ}_{S}$ contain
a negative literal then S has a solution.

Proof: Follows from the remarks above and the fact that if all clauses generated
contain a (selected) negative literal then $\Box$ cannot be obtained by $\mathrm{CRes}^{\succ}_{S}$. □

This happens e.g. if for all $1 \leq i \leq k$, the disjunctive normal forms of $s_i, t_i$ do
not contain constant terms.

Unification with Free Constants: One Equation. If S consists of only one
equation, the last type of inferences can be proved to produce only constrained 
clauses with the property that all their ground instances are already subsumed 
by the set of ground instances of clauses of type (d) previously generated, as will 
be shown in Corollary 4. 

Lemma 1. The satisfiability of a constraint $\phi$ of size $|\phi|$ can be checked in time
linear in $|C| \cdot |\phi|$.

Proof: (Sketch) Note first that a constraint

$$\phi = \bigwedge_{i \in I_1}(c_i \in X_i) \wedge \bigwedge_{i \in I_2}(d_i \notin X_i) \wedge \bigwedge_{i \in I_3, j \in I_4}(X_i \subseteq X_j) \wedge \bigwedge_{i \in I_1 \cup \cdots \cup I_4}(X_i \subseteq C)$$

is satisfiable iff the set

$$E_\phi = \{P_{X_i}(c_i) \mid i \in I_1\} \cup \{\neg P_{X_j}(d_j) \mid j \in I_2\} \cup \{P_{X_i}(x) \to P_{X_j}(x) \mid i \in I_3, j \in I_4\}$$

of Horn clauses is satisfiable by a Herbrand interpretation. Moreover, $E_\phi$ is
satisfiable iff the set $G_\phi$ of all its ground instances (which has cardinality $|C| \times |\phi|$)
is satisfiable. Since all clauses in $G_\phi$ are ground Horn clauses, it follows by results
in [11] that the satisfiability of $G_\phi$ can be checked in time linear in the size of
$G_\phi$, i.e. linear in $|C| \times |\phi|$. □
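
The reduction in the proof suggests a direct test: compute the least sets forced by the membership and inclusion conjuncts, then check the non-membership conjuncts against them. The Python sketch below follows this idea with a naive fixpoint loop rather than the linear-time procedure of [11]. The input format (lists of pairs) and the function name are our own assumptions, and the conjuncts $X_i \subseteq C$ are taken to hold trivially because all constants are drawn from C.

```python
def constraint_satisfiable(members, non_members, subsets):
    """Check a constraint built from c in X, d not in X and X subset-of Z.

    members:     pairs (c, X) meaning c in X
    non_members: pairs (d, X) meaning d not in X
    subsets:     pairs (X, Z) meaning X is a subset of Z
    """
    least = {}
    for c, x in members:
        least.setdefault(x, set()).add(c)
    changed = True
    while changed:
        changed = False
        for x, z in subsets:
            # Everything forced into X is also forced into Z.
            extra = least.get(x, set()) - least.get(z, set())
            if extra:
                least.setdefault(z, set()).update(extra)
                changed = True
    # The least solution satisfies the constraint iff no excluded
    # constant was forced into its set variable.
    return all(d not in least.get(x, set()) for d, x in non_members)
```

Applied to the example of Section 4.3, the constraint of clause (23), $[c \notin X, c \in X, X \subseteq C]$, is rejected, while that of clause (25), $[c \notin X, c \in Y, X \subseteq Y \subseteq C]$, is accepted.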
Corollary 4. Let $S : \{s = t\}$ consist only of one equation. Then the satisfiability
of S can be checked in time + length(t)).

Proof: (Idea) In a first step, by inferences with (Ren), Ps{X) Pt{X) generates 
literals of the form (a)-(c), and, possibly of type (d). Let Pt{X)\(j)i\,i S / be 
all clauses of type (d) generated this way. Similarly, let Ps{X)\(f)b\, j G J be 
all clauses of type (d) generated from Pt{X) Ps{X). By the remarks at the 
beginning of this subsection, (ft, (f>j contain only constraints of the form X C C 
and Ci G X. By inferences between Pt{X)l(j)il, i G I and Pt{Z) — > Pg{Z) \Z C C] 
the clause Pslf/ii] is generated. generated, for all j G J, in a similar 

way. The constraint of a clause of type (a’)-(d’) contains, as a conjunct, one of 
the constraints (j)i or (()'. Let Di Pi(X)|^/>] be one of the clauses in (a)-(c). 
A resolution with a clause of type (a’)-(d’) would produce a clause of the form 
D[ Pt\if t\ (j)i p\. The set of ground instances of such a clause is redundant 
with respect any set of clauses that contains all ground instances of Pt 
hence (by the proof of Theorem 7) all such clauses can be considered redundant. 

Similar arguments concerning redundancy of generated clauses can be used 
to control the inferences with (Her) and to prove that all resolvents of clauses of 
type (a’)-(d’) as well as resolvents of inferences with Her and clauses of type (a)- 
(d) have the property that all their ground instances are subsumed by ground 
instances of clauses of type (d). The conclusion follows since clauses are 
generated this way and the constraints can be checked in linear time. □ 

Unification without Constants in $D_{01}$. If $C = \emptyset$, then $F_{D_{01}}(C)$ is the two-
element lattice. By Theorem 2, a $D_{01}$-unification problem $S : \{s_1 = t_1, \ldots, s_k =
t_k\}$ with variables $\{y_1, \ldots, y_n\}$ and no constants has a solution w.r.t. the equa-
tional theory of $D_{01}$ iff the existential closure of S, $\exists y_1, \ldots, y_n (s_1 = t_1 \wedge \cdots \wedge s_k =
t_k)$, is valid in the two-element lattice. The number of clauses corresponding to
(Her) $\cup$ (Ren) $\cup$ (P) in Theorem 5 is in this case polynomial in the size of ST(S).
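
For problems without constants, the characterisation above amounts to a satisfiability test over the two-element lattice. A minimal Python sketch, assuming our own term encoding (nested tuples `("and", a, b)` / `("or", a, b)`, strings for variables, integers 0 and 1) and trying all $2^{|Y|}$ assignments, is:

```python
from itertools import product

def eval2(term, env):
    """Evaluate a term in the two-element lattice ({0, 1}, max, min, 0, 1)."""
    if isinstance(term, int):
        return term
    if isinstance(term, str):
        return env[term]
    op, left, right = term
    a, b = eval2(left, env), eval2(right, env)
    return min(a, b) if op == "and" else max(a, b)

def unifiable_without_constants(equations, variables):
    """True iff some 0/1 assignment to the variables satisfies every equation."""
    for values in product((0, 1), repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(eval2(s, env) == eval2(t, env) for s, t in equations):
            return True
    return False
```

For example, $\{x \vee y = 1, x \wedge y = 0\}$ is solvable (x := 1, y := 0), while $\{x \wedge y = 1, x \vee y = 0\}$ is not.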

Theorem 8 (Complexity). Let S be a $D_{01}$-unification problem without con-
stants. Assume that all terms in S have been simplified by (recursively) applying
the following simplification rules$^5$: $e \wedge 1 \mapsto e$; $e \wedge 0 \mapsto 0$; $e \vee 1 \mapsto 1$; $e \vee 0 \mapsto e$.

(1) If $\{0, 1\} \not\subseteq ST(S)$, then S always has a solution.

(2) If $\{0, 1\} \subseteq ST(S)$ and S consists of only one equation (or else it contains
the equation 1 = 0) then S has no solution.

(3) If S only contains the operators $\wedge$, 0, 1, then the problem of checking whether
S has a solution can be solved in polynomial time. The same holds if S only
contains the operators $\vee$, 0, 1.

(4) Otherwise, the problem of checking whether S has a solution is NP-complete.
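
The simplification assumed in Theorem 8 can be sketched as a bottom-up rewrite. In the Python fragment below the term encoding (nested tuples `("and", a, b)` / `("or", a, b)`, strings for variables, integers 0 and 1) and the function name `simplify` are our own; the rules are applied to either argument, i.e. modulo commutativity, which is an assumption on our part.

```python
def simplify(term):
    """Apply e&1 -> e, e&0 -> 0, e|1 -> 1, e|0 -> e recursively, bottom-up."""
    if not isinstance(term, tuple):
        return term
    op, left, right = term
    a, b = simplify(left), simplify(right)
    if op == "and":
        if a == 0 or b == 0:
            return 0
        if a == 1:
            return b
        if b == 1:
            return a
    else:
        if a == 1 or b == 1:
            return 1
        if a == 0:
            return b
        if b == 0:
            return a
    return (op, a, b)
```

After simplification, 0 and 1 can only occur as complete sides of an equation, which is what cases (1) and (2) of the theorem rely on.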

Proof: (Sketch) (1) Assume that $1 \notin ST(S)$. Let $\Phi_S$ be the set of clauses associ-
ated to S by the structure-preserving translation to clause form in Theorem 5(3).
Since no constant occurs in S, all the clauses in $\Phi_S$ are non-positive, so $\Phi_S$ is satis-
fiable (consider a selection function that selects all negative literals in all clauses;
then no resolution inference is possible). The case $0 \notin ST(S)$ follows by duality.

(2) is obvious, and (3) follows from the second part of Theorem 6.

(4) The problem of deciding whether a unification problem without constants
has a solution is clearly in NP. NP-hardness follows from the fact that the satis-
fiability problem for Boolean formulae of the form $E = F \wedge \neg G$, where F and G
only contain the operators $\vee$ and $\wedge$ (which is NP-complete [15]) can be reduced
in polynomial time to the satisfiability of a $D_{01}$-unification problem with at least
two equations $s_i = 0$ and $s_j = 1$, namely $S : \{F = 1, G = 0\}$. □

$^5$ This can be done in polynomial time.



4.5 Unification with Linear Constant Restrictions 



We now show that the idea used in the method described in Section 4.1 can be
adapted to give an algorithm for unification with linear constant restrictions.

We first express the fact that $t \in F_{D_{01}}(C \setminus \{c\})$ by using the isomorphism
$\eta_C : F_{D_{01}}(C) \to \mathcal{O}(\mathcal{P}(C), \subseteq)$ defined for every $t \in F_{D_{01}}(C)$ by $\eta_C(t) = \{f^{-1}(1) \cap
C \mid f : F_{D_{01}}(C) \to 2$ is a 0,1-lattice homomorphism; $f(t) = 1\}$ (cf. Theorem 1).

Lemma 2. If $t \in F_{D_{01}}(C)$ then there exists $t' \in F_{D_{01}}(C \setminus \{c\})$ such that $t =_{D_{01}} t'$
iff $\eta_C(t) = \emptyset$ or $C \setminus \{c\} \in \eta_C(t)$.

Proof: (Idea) This is a consequence of the fact that for every $t \in F_{D_{01}}(C)$, there
exists $t' \in F_{D_{01}}(C \setminus \{c\})$ such that $t =_{D_{01}} t'$ iff for every $X \subseteq C$, if $c \in X$ then
either $X \notin \eta_C(t)$ or X is not minimal in $\eta_C(t)$. The proof of the equivalence
above uses the way $\eta_D$ is defined for every D, and the fact that there exists an
injective homomorphism $i : F_{D_{01}}(C \setminus \{c\}) \to F_{D_{01}}(C)$, such that for every $U \in
\mathcal{O}(\mathcal{P}(C \setminus \{c\}))$, $\eta_C(i(\eta_{C \setminus \{c\}}^{-1}(U)))$ is the order-filter generated by U in $(\mathcal{P}(C), \subseteq)$.

□ 

As in the case of unification with free constants, the remark above justifies 
a structure-preserving translation to clause form. 

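Theorem 9. Let $S : \{s_1 = t_1, \ldots, s_k = t_k\}$ be a $D_{01}$-unification problem with 
linear constant restrictions Lcr, constants $C$ and variables $Y$. The following are 
equivalent: 

(1) $S$ has a solution w.r.t. the equational theory of $D_{01}$. 

(2) The conjunction of the following set of formulae is satisfiable: 

$$
\begin{array}{lll}
(\mathrm{Her}) & P_y(X_1) \rightarrow P_y(X_2) & \text{for all } X_1 \subseteq X_2 \subseteq C,\ y \in Y \\
(\mathrm{Ren}) & P_{e_1 \wedge e_2}(X) \leftrightarrow P_{e_1}(X) \wedge P_{e_2}(X) & \text{for all } X \subseteq C \\
 & P_{e_1 \vee e_2}(X) \leftrightarrow P_{e_1}(X) \vee P_{e_2}(X) & \text{for all } X \subseteq C \\
 & P_1(X) & \text{for all } X \subseteq C \\
 & \neg P_0(X) & \text{for all } X \subseteq C \\
 & P_c(X) & \text{for all } X \subseteq C \text{ with } c \in X \\
 & \neg P_c(X) & \text{for all } X \subseteq C \text{ with } c \notin X \\
(\mathrm{Lcr}) & P_y(X) \rightarrow P_y(C \setminus \{c\}) & \text{for all } X \subseteq C \text{ if } y < c \in \mathrm{Lcr} \\
(\mathrm{P}) & P_{s_i}(X) \leftrightarrow P_{t_i}(X) & \text{for all } 1 \le i \le k
\end{array}
$$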
Proof: (Sketch) This follows from Definition 2, Lemma 2, and arguments similar 
to those used for Theorem 5. □ 

5 Conclusion 

We presented a resolution-based method for deciding unifiability w.r.t. the equational 
theory of bounded distributive lattices with operators. The method uses 
the Priestley representation for bounded distributive lattices, in particular the 
description of the dual of the free lattice in $D_{01}$ over $C$ as $(\mathcal{P}(C), \subseteq)$. 

This helped us to reduce the problem of checking whether a $D_{01}$-unification 
problem $S$ with constants $C$ (and linear constant restrictions) has a solution to 
the problem of checking the satisfiability of a set $\Phi_S$ of clauses. $\Phi_S$ can be 
represented both as a finite set of ground clauses and as a set of constrained clauses; 
the latter representation is much more compact. We formulated a resolution calculus 
for such constrained clauses and proved its soundness and completeness. The 
method we give is in general still exponential in $|ST(S)| \cdot 2^{|C|}$. However, in several 
situations it is more efficient than other existing methods: syntactic information 
about the terms in $S$ is sometimes reflected by the form of clauses, which allows 
us to establish better upper bounds for these particular problems. Our algorithm 
also behaves well for unification problems consisting of only one equation. 

It would be interesting to compare our method with more general unification 
algorithms, e.g. based on rewriting. One such algorithm [8] decides unifiability 
over the free algebra, i.e. in an algebraic extension of the free algebra. We 
analyzed this more general problem for the equational theory of $D_{01}$. As part of 
work in progress, we prove that this reduces to Boolean unifiability (due to the 
fact that the algebraically closed elements of $D_{01}$ are the Boolean lattices). 
Abstract. While there has been a great deal of work on the development of 
reasoning algorithms for expressive description logics, in most cases only Tbox 
reasoning is considered. In this paper we present an algorithm for combined Tbox 
and Abox reasoning in the SHIQ description logic. This algorithm is of particular 
interest as it can be used to decide the problem of (database) conjunctive query 
containment w.r.t. a schema. Moreover, the realisation of an efficient implementation 
should be relatively straightforward as it can be based on an existing highly 
optimised implementation of the Tbox algorithm in the FaCT system. 



1 Motivation 

A description logic (DL) knowledge base (KB) is made up of two parts, a termino- 
logical part (the terminology or Tbox) and an assertional part (the Abox), each part 
consisting of a set of axioms. The Tbox asserts facts about concepts (sets of objects) 
and roles (binary relations), usually in the form of inclusion axioms, while the Abox 
asserts facts about individuals (single objects), usually in the form of instantiation ax- 
ioms. For example, a Tbox might contain an axiom asserting that Man is subsumed by 
Animal, while an Abox might contain axioms asserting that both Aristotle and Plato 
are instances of the concept Man and that the pair (Aristotle, Plato) is an instance of 
the role Pupil-of. 

For logics that include full negation, all common DL reasoning tasks are reducible 
to deciding KB consistency, i.e., determining if a given KB admits a non-empty inter- 
pretation [6]. There has been a great deal of work on the development of reasoning 
algorithms for expressive DLs [2,12,16,11], but in most cases these consider only Tbox 
reasoning (i.e., the Abox is assumed to be empty). With expressive DLs, determining 
consistency of a Tbox can often be reduced to determining the satisfiability of a single 
concept [2,23,3], and — as most DLs enjoy the tree model property (i.e., if a concept has 
a model, then it has a tree model) — this problem can be decided using a tableau-based 
decision procedure. 

The relative lack of interest in Abox reasoning can also be explained by the fact that 
many applications only require Tbox reasoning, e.g., ontological engineering [15,20] 
and schema integration [10]. Of particular interest in this regard is the DL SHIQ [18], 
which is powerful enough to encode the logic DLR [10], and which can thus be used 
for reasoning about conceptual data models, e.g., Entity-Relationship (ER) schemas [9]. 
Moreover, if we think of the Tbox as a schema and the Abox as (possibly incomplete) 
data, then it seems reasonable to assume that realistic Tboxes will be of limited size, 
whereas realistic Aboxes could be of almost unlimited size. Given the high complexity 
of reasoning in most DLs [23,7], this suggests that Abox reasoning could lead to severe 
tractability problems in realistic applications.^1 

However, SHIQ Abox reasoning is of particular interest as it allows DLR schema 
reasoning to be extended to reasoning about conjunctive query containment w.r.t. a 
schema [8]. This is achieved by using Abox individuals to represent variables and con- 
stants in the queries, and to enforce co-references [17]. In this context, the size of the 
Abox would be quite small (it is bounded by the number of variables occurring in the 
queries), and should not lead to severe tractability problems. 

Moreover, an alternative view of the Abox is that it provides a restricted form of 
reasoning with nominals, i.e., allowing individual names to appear in concepts [22,5,1]. 
Unrestricted nominals are very powerful, allowing arbitrary co-references to be en- 
forced and thus leading to the loss of the tree model property. This makes it much harder 
to prove decidability and to devise decision procedures (the decidability of SHIQ with 
unrestricted nominals is still an open problem). An Abox, on the other hand, can be 
modelled by a forest, a set of trees whose root nodes form an arbitrarily connected 
graph, where the number of trees is limited by the number of individual names occurring 
in the Abox. Even the restricted form of co-referencing provided by an Abox is quite 
powerful, and can extend the range of applications for the DL's reasoning services. 

In this paper we present a tableaux-based algorithm for deciding the satisfiability 
of unrestricted SHIQ KBs (i.e., ones where the Abox may be non-empty) that ex- 
tends the existing consistency algorithm for Tboxes [18] by making use of the forest 
model property. This should make the realisation of an efficient implementation rela- 
tively straightforward as it can be based on an existing highly optimised implementation 
of the Tbox algorithm (e.g., in the FaCT system [14]). A notable feature of the algo- 
rithm is that, instead of making a unique name assumption w.r.t. all individuals (an 
assumption commonly made in DLs [4]), increased flexibility is provided by allowing 
the Abox to contain axioms explicitly asserting inequalities between pairs of individual 
names (adding such an axiom for every pair of individual names is obviously equivalent 
to making a unique name assumption). 



2 Preliminaries 

In this section, we introduce the DL SHIQ. This includes the definition of syntax, 
semantics, inference problems (concept subsumption and satisfiability, Abox consistency, 
and all of these problems with respect to terminologies^2), and their relationships. 

SHIQ is based on an extension of the well known DL ALC [24] to include 
transitively closed primitive roles [21]; we call this logic S due to its relationship with 
the propositional (multi) modal logic $\mathbf{S4}_{(m)}$ [23].^3 This basic DL is then extended with 
inverse roles (I), role hierarchies (H), and qualifying number restrictions (Q). 

^1 Although suitably optimised algorithms may make reasoning practicable for quite large 
Aboxes [13]. 

^2 We use terminologies instead of Tboxes to underline the fact that we allow for general concept 
inclusion axioms and do not disallow cycles. 

Definition 1. Let $\mathbf{C}$ be a set of concept names and $\mathbf{R}$ a set of role names with a subset 
$\mathbf{R}_+ \subseteq \mathbf{R}$ of transitive role names. The set of roles is $\mathbf{R} \cup \{R^- \mid R \in \mathbf{R}\}$. To avoid 
considering roles such as $R^{--}$, we define a function $\mathsf{Inv}$ on roles such that $\mathsf{Inv}(R) = R^-$ 
if $R$ is a role name, and $\mathsf{Inv}(R) = S$ if $R = S^-$. We also define a function $\mathsf{Trans}$ 
which returns true iff $R$ is a transitive role. More precisely, $\mathsf{Trans}(R) = \mathit{true}$ iff 
$R \in \mathbf{R}_+$ or $\mathsf{Inv}(R) \in \mathbf{R}_+$. 

A role inclusion axiom is an expression of the form $R \sqsubseteq S$, where $R$ and $S$ are 
roles, each of which can be inverse. A role hierarchy is a set of role inclusion axioms. 
For a role hierarchy $\mathcal{R}$, we define the relation $\sqsubseteq^*$ to be the transitive-reflexive closure 
of $\sqsubseteq$ over $\mathcal{R} \cup \{\mathsf{Inv}(R) \sqsubseteq \mathsf{Inv}(S) \mid R \sqsubseteq S \in \mathcal{R}\}$. A role $R$ is called a sub-role (resp. 
super-role) of a role $S$ if $R \sqsubseteq^* S$ (resp. $S \sqsubseteq^* R$). A role is simple if it is neither transitive 
nor has any transitive sub-roles. 

The set of SHIQ-concepts is the smallest set such that 

- every concept name is a concept, and 

- if $C$, $D$ are concepts, $R$ is a role, $S$ is a simple role, and $n$ is a nonnegative integer, 
then $C \sqcap D$, $C \sqcup D$, $\neg C$, $\forall R.C$, $\exists R.C$, $\geqslant n S.C$, and $\leqslant n S.C$ are also concepts. 

A general concept inclusion axiom (GCI) is an expression of the form $C \sqsubseteq D$ for two 
SHIQ-concepts $C$ and $D$. A terminology is a set of GCIs. 

Let $\mathbf{I} = \{a, b, c, \ldots\}$ be a set of individual names. An assertion is of the form $a : C$, 
$(a, b) : R$, or $a \not\doteq b$ for $a, b \in \mathbf{I}$, a (possibly inverse) role $R$, and a SHIQ-concept $C$. 
An Abox is a finite set of assertions. 
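
The role machinery of Definition 1 can be made concrete with a small sketch. The encoding below is an assumption made purely for illustration and is not taken from the paper: roles are strings, an inverse role carries a trailing "-", and a role hierarchy is a collection of (sub, super) pairs. The function names `inv`, `trans`, `subrole_closure`, and `is_simple` are hypothetical.

```python
from itertools import product

def inv(role):
    """Inv(R): drop the trailing '-' of an inverse role, or add one to a role name."""
    return role[:-1] if role.endswith("-") else role + "-"

def trans(role, transitive_names):
    """Trans(R) holds iff R or Inv(R) is a transitive role name."""
    return role in transitive_names or inv(role) in transitive_names

def subrole_closure(roles, hierarchy):
    """Reflexive-transitive closure of the inclusions and their inverse counterparts."""
    below = {(r, r) for r in roles}
    for sub, sup in hierarchy:
        below.add((sub, sup))
        below.add((inv(sub), inv(sup)))
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(below), repeat=2):
            if b == c and (a, d) not in below:
                below.add((a, d))
                changed = True
    return below

def is_simple(role, roles, hierarchy, transitive_names):
    """A role is simple iff neither it nor any of its sub-roles is transitive."""
    below = subrole_closure(roles, hierarchy)
    return not any(trans(r, transitive_names) for r in roles if (r, role) in below)
```

Because the closure is reflexive, `is_simple` covers the role itself and its proper sub-roles in one pass; `roles` is expected to contain every role name together with its inverse.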



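Next, we define the semantics of SHIQ and the corresponding inference problems. 

Definition 2. An interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ consists of a set $\Delta^{\mathcal{I}}$, called the domain 
of $\mathcal{I}$, and a valuation $\cdot^{\mathcal{I}}$ which maps every concept to a subset of $\Delta^{\mathcal{I}}$ and every role to 
a subset of $\Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ such that, for all concepts $C$, $D$, roles $R$, $S$, and non-negative 
integers $n$, the following equations are satisfied, where $\sharp M$ denotes the cardinality of a 
set $M$ and $(R^{\mathcal{I}})^+$ the transitive closure of $R^{\mathcal{I}}$: 

$$
\begin{array}{rcll}
R^{\mathcal{I}} & = & (R^{\mathcal{I}})^+ & \text{for each role } R \in \mathbf{R}_+ \\
(R^-)^{\mathcal{I}} & = & \{(x, y) \mid (y, x) \in R^{\mathcal{I}}\} & \text{(inverse roles)} \\
(C \sqcap D)^{\mathcal{I}} & = & C^{\mathcal{I}} \cap D^{\mathcal{I}} & \text{(conjunction)} \\
(C \sqcup D)^{\mathcal{I}} & = & C^{\mathcal{I}} \cup D^{\mathcal{I}} & \text{(disjunction)} \\
(\neg C)^{\mathcal{I}} & = & \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}} & \text{(negation)} \\
(\exists R.C)^{\mathcal{I}} & = & \{x \mid \exists y.\, (x, y) \in R^{\mathcal{I}} \text{ and } y \in C^{\mathcal{I}}\} & \text{(exists restriction)} \\
(\forall R.C)^{\mathcal{I}} & = & \{x \mid \forall y.\, (x, y) \in R^{\mathcal{I}} \text{ implies } y \in C^{\mathcal{I}}\} & \text{(value restriction)} \\
(\geqslant n R.C)^{\mathcal{I}} & = & \{x \mid \sharp\{y \mid (x, y) \in R^{\mathcal{I}} \text{ and } y \in C^{\mathcal{I}}\} \geqslant n\} & (\geqslant\text{-number restriction}) \\
(\leqslant n R.C)^{\mathcal{I}} & = & \{x \mid \sharp\{y \mid (x, y) \in R^{\mathcal{I}} \text{ and } y \in C^{\mathcal{I}}\} \leqslant n\} & (\leqslant\text{-number restriction})
\end{array}
$$

An interpretation $\mathcal{I}$ satisfies a role hierarchy $\mathcal{R}$ iff $R^{\mathcal{I}} \subseteq S^{\mathcal{I}}$ for each $R \sqsubseteq S$ in $\mathcal{R}$. 
Such an interpretation is called a model of $\mathcal{R}$ (written $\mathcal{I} \models \mathcal{R}$). 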



^3 The logic S has previously been called $\mathcal{ALC}_{R^+}$, but this becomes too cumbersome when 
adding letters to represent additional features. 
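An interpretation $\mathcal{I}$ satisfies a terminology $\mathcal{T}$ iff $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ for each GCI $C \sqsubseteq D$ in 
$\mathcal{T}$. Such an interpretation is called a model of $\mathcal{T}$ (written $\mathcal{I} \models \mathcal{T}$). 

A concept $C$ is called satisfiable with respect to a role hierarchy $\mathcal{R}$ and a 
terminology $\mathcal{T}$ iff there is a model $\mathcal{I}$ of $\mathcal{R}$ and $\mathcal{T}$ with $C^{\mathcal{I}} \neq \emptyset$. A concept $D$ subsumes a 
concept $C$ w.r.t. $\mathcal{R}$ and $\mathcal{T}$ iff $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ holds for each model $\mathcal{I}$ of $\mathcal{R}$ and $\mathcal{T}$. For an 
interpretation $\mathcal{I}$, an element $x \in \Delta^{\mathcal{I}}$ is called an instance of a concept $C$ iff $x \in C^{\mathcal{I}}$. 

For Aboxes, an interpretation maps, additionally, each individual $a \in \mathbf{I}$ to some 
element $a^{\mathcal{I}} \in \Delta^{\mathcal{I}}$. An interpretation $\mathcal{I}$ satisfies an assertion 

$a : C$ iff $a^{\mathcal{I}} \in C^{\mathcal{I}}$, 

$(a, b) : R$ iff $(a^{\mathcal{I}}, b^{\mathcal{I}}) \in R^{\mathcal{I}}$, and 

$a \not\doteq b$ iff $a^{\mathcal{I}} \neq b^{\mathcal{I}}$. 

An Abox $\mathcal{A}$ is consistent w.r.t. $\mathcal{R}$ and $\mathcal{T}$ iff there is a model $\mathcal{I}$ of $\mathcal{R}$ and $\mathcal{T}$ that satisfies 
each assertion in $\mathcal{A}$. 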
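
On a finite interpretation the semantic equations of Definition 2 can be evaluated directly. The following sketch assumes a hypothetical encoding that is not part of the paper: concepts are nested tuples such as `("some", "R", ("name", "A"))`, inverse roles carry a trailing "-", and an interpretation is a dictionary with the keys `domain`, `concepts`, and `roles`. It does not check that transitive role names are interpreted by transitive relations.

```python
def role_ext(role, interp):
    """Extension of a role; an inverse role 'R-' swaps the pairs of 'R'."""
    if role.endswith("-"):
        return {(y, x) for (x, y) in interp["roles"].get(role[:-1], set())}
    return set(interp["roles"].get(role, set()))

def ext(concept, interp):
    """Extension of a concept, following the equations of Definition 2."""
    domain = interp["domain"]
    op = concept[0]
    if op == "name":
        return set(interp["concepts"].get(concept[1], set()))
    if op == "not":
        return domain - ext(concept[1], interp)
    if op == "and":
        return ext(concept[1], interp) & ext(concept[2], interp)
    if op == "or":
        return ext(concept[1], interp) | ext(concept[2], interp)
    if op in ("some", "all"):
        _, role, filler = concept
        pairs, fill = role_ext(role, interp), ext(filler, interp)
        if op == "some":
            return {x for x in domain
                    if any((x, y) in pairs and y in fill for y in domain)}
        return {x for x in domain
                if all(y in fill for y in domain if (x, y) in pairs)}
    if op in ("atleast", "atmost"):
        _, n, role, filler = concept
        pairs, fill = role_ext(role, interp), ext(filler, interp)
        def count(x):
            # Number of role successors of x that belong to the filler concept.
            return sum(1 for y in domain if (x, y) in pairs and y in fill)
        if op == "atleast":
            return {x for x in domain if count(x) >= n}
        return {x for x in domain if count(x) <= n}
    raise ValueError(f"unknown constructor: {op}")
```

With this helper, an assertion $a : C$ holds in an interpretation exactly when the element assigned to $a$ is a member of `ext(C, interp)`.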

For DLs that are closed under negation, subsumption and (un)satisfiability can be 
mutually reduced: $C \sqsubseteq D$ iff $C \sqcap \neg D$ is unsatisfiable, and $C$ is unsatisfiable iff $C \sqsubseteq A \sqcap \neg A$ 
for some concept name $A$. Moreover, a concept $C$ is satisfiable iff the Abox $\{a : C\}$ is 
consistent. It is straightforward to extend these reductions to role hierarchies, but 
terminologies deserve special care: In [2,23,3], the internalisation of GCIs is introduced, 
a technique that reduces reasoning w.r.t. a (possibly cyclic) terminology to reasoning 
w.r.t. the empty terminology. For SHIQ, this reduction must be slightly modified. The 
following Lemma shows how general concept inclusion axioms can be internalised 
using a "universal" role $U$, that is, a transitive super-role of all roles occurring in $\mathcal{T}$ and 
their respective inverses. 

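Lemma 1. Let $C$, $D$ be concepts, $\mathcal{A}$ an Abox, $\mathcal{T}$ a terminology, and $\mathcal{R}$ a role hierarchy. 
We define 

$$C_{\mathcal{T}} := \bigsqcap_{C_i \sqsubseteq D_i \in \mathcal{T}} \neg C_i \sqcup D_i.$$

Let $U$ be a transitive role that does not occur in $\mathcal{T}$, $C$, $D$, $\mathcal{A}$, or $\mathcal{R}$. We set 

$$\mathcal{R}_U := \mathcal{R} \cup \{R \sqsubseteq U,\ \mathsf{Inv}(R) \sqsubseteq U \mid R \text{ occurs in } \mathcal{T}, C, D, \mathcal{A}, \text{ or } \mathcal{R}\}.$$

- $C$ is satisfiable w.r.t. $\mathcal{T}$ and $\mathcal{R}$ iff $C \sqcap C_{\mathcal{T}} \sqcap \forall U.C_{\mathcal{T}}$ is satisfiable w.r.t. $\mathcal{R}_U$. 

- $D$ subsumes $C$ with respect to $\mathcal{T}$ and $\mathcal{R}$ iff $C \sqcap \neg D \sqcap C_{\mathcal{T}} \sqcap \forall U.C_{\mathcal{T}}$ is unsatisfiable 
w.r.t. $\mathcal{R}_U$. 

- $\mathcal{A}$ is consistent with respect to $\mathcal{R}$ and $\mathcal{T}$ iff $\mathcal{A} \cup \{a : C_{\mathcal{T}} \sqcap \forall U.C_{\mathcal{T}} \mid a \text{ occurs in } \mathcal{A}\}$ 
is consistent w.r.t. $\mathcal{R}_U$. 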
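
The construction in Lemma 1 is purely syntactic, so it can be sketched in a few lines. The tuple encoding of concepts, the pair encoding of GCIs and role inclusions, and the function names `internalise`, `extend_hierarchy`, and `sat_problem` are assumptions made for this illustration only. The sketch also assumes a non-empty terminology and that the caller picks a name for the universal role that is fresh.

```python
from functools import reduce

def internalise(tbox):
    """Build C_T: the conjunction of (not C_i or D_i) over all GCIs (C_i, D_i)."""
    if not tbox:
        raise ValueError("this sketch assumes a non-empty terminology")
    disjuncts = [("or", ("not", c), d) for c, d in tbox]
    return reduce(lambda acc, d: ("and", acc, d), disjuncts)

def extend_hierarchy(hierarchy, role_names, universal="U"):
    """Build R_U: every occurring role and its inverse becomes a sub-role of U."""
    extra = set()
    for r in role_names:
        extra.add((r, universal))
        extra.add((r + "-", universal))
    return set(hierarchy) | extra

def sat_problem(concept, tbox, universal="U"):
    """Concept whose satisfiability w.r.t. R_U decides satisfiability w.r.t. the Tbox."""
    c_t = internalise(tbox)
    return ("and", ("and", concept, c_t), ("all", universal, c_t))
```

The role `U` must additionally be declared transitive when the resulting problem is handed to a reasoner; that declaration lives outside this sketch.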

The proof of Lemma 1 is similar to the ones that can be found in [23,2]. Most 
importantly, it must be shown that (a) if a SHIQ-concept $C$ is satisfiable with respect 
to a terminology $\mathcal{T}$ and a role hierarchy $\mathcal{R}$, then $C$, $\mathcal{T}$ have a connected model, i.e., a 
model where any two elements are connected by a role path over those roles occurring in $C$ 
and $\mathcal{T}$, and (b) if $y$ is reachable from $x$ via a role path (possibly involving inverse roles), 
then $(x, y) \in U^{\mathcal{I}}$. These are easy consequences of the semantics and the definition of 
$U$. 
Theorem 1. Satisfiability and subsumption of SHIQ-concepts w.r.t. terminologies and 
role hierarchies are polynomially reducible to (un)satisfiability of SHIQ-concepts 
w.r.t. role hierarchies, and therefore to consistency of SHIQ-Aboxes w.r.t. role 
hierarchies. 

Consistency of SHIQ-Aboxes w.r.t. terminologies and role hierarchies is 
polynomially reducible to consistency of SHIQ-Aboxes w.r.t. role hierarchies. 



3 A SHIQ-Abox Tableau Algorithm 

With Theorem 1, all standard inference problems for SHIQ-concepts and Aboxes can 
be reduced to Abox-consistency w.r.t. a role hierarchy. In the following, we present a 
tableau-based algorithm that decides consistency of SHIQ-Aboxes w.r.t. role 
hierarchies, and therefore all other SHIQ inference problems presented. 

The algorithm tries to construct, for a SHIQ-Abox $\mathcal{A}$, a tableau for $\mathcal{A}$, that is, an 
abstraction of a model of $\mathcal{A}$. Given the notion of a tableau, it is then quite 
straightforward to prove that the algorithm is a decision procedure for Abox consistency. 

3.1 A Tableau for Aboxes 

In the following, if not stated otherwise, $C$, $D$ denote SHIQ-concepts, $\mathcal{R}$ a role 
hierarchy, $\mathcal{A}$ an Abox, $\mathbf{R}_{\mathcal{A}}$ the set of roles occurring in $\mathcal{A}$ and $\mathcal{R}$ together with their inverses, 
and $\mathbf{I}_{\mathcal{A}}$ the set of individuals occurring in $\mathcal{A}$. 

Without loss of generality, we assume all concepts $C$ occurring in assertions $a : C \in \mathcal{A}$ 
to be in NNF, that is, negation occurs in front of concept names only. Any 
SHIQ-concept can easily be transformed into an equivalent one in NNF by pushing negations 
inwards using a combination of De Morgan's laws and the following equivalences: 

$$
\begin{array}{ll}
\neg(\exists R.C) \equiv (\forall R.\neg C) & \neg(\forall R.C) \equiv (\exists R.\neg C) \\
\neg(\leqslant n R.C) \equiv\ \geqslant (n+1) R.C & \neg(\geqslant n R.C) \equiv\ \leqslant (n-1) R.C
\end{array}
$$

where $\leqslant (-1) R.C := A \sqcap \neg A$ for some $A \in \mathbf{C}$. 

For a concept $C$ we will denote the NNF of $\neg C$ by $\sim C$. Next, for a concept $C$, $\mathit{clos}(C)$ 
is the smallest set that contains $C$ and is closed under sub-concepts and $\sim$. We use 
$\mathit{clos}(\mathcal{A}) := \bigcup_{a : C \in \mathcal{A}} \mathit{clos}(C)$ for the closure $\mathit{clos}(C)$ of each concept $C$ occurring in $\mathcal{A}$. 
It is not hard to show that the size of $\mathit{clos}(\mathcal{A})$ is polynomial in the size of $\mathcal{A}$. 
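
The NNF transformation can be written as a short recursive function. This is a sketch under assumptions that are not part of the paper: concepts are nested tuples, and the concept name `A0` used to encode $\leqslant (-1) R.C$ as $A \sqcap \neg A$ is a hypothetical choice.

```python
BOTTOM_NAME = ("name", "A0")  # hypothetical concept name used for A ⊓ ¬A

def nnf(c):
    """Push negations inwards until they occur only in front of concept names."""
    op = c[0]
    if op == "name":
        return c
    if op in ("and", "or"):
        return (op, nnf(c[1]), nnf(c[2]))
    if op in ("some", "all"):
        return (op, c[1], nnf(c[2]))
    if op in ("atleast", "atmost"):
        return (op, c[1], c[2], nnf(c[3]))
    # Remaining case: op == "not"; dispatch on the negated concept.
    d = c[1]
    dop = d[0]
    if dop == "name":
        return c
    if dop == "not":
        return nnf(d[1])
    if dop == "and":
        return ("or", nnf(("not", d[1])), nnf(("not", d[2])))
    if dop == "or":
        return ("and", nnf(("not", d[1])), nnf(("not", d[2])))
    if dop == "some":
        return ("all", d[1], nnf(("not", d[2])))
    if dop == "all":
        return ("some", d[1], nnf(("not", d[2])))
    if dop == "atmost":
        return ("atleast", d[1] + 1, d[2], nnf(d[3]))
    if dop == "atleast":
        if d[1] == 0:
            # not(>= 0 R.C) is unsatisfiable: the paper's "<= (-1) R.C" case.
            return ("and", BOTTOM_NAME, ("not", BOTTOM_NAME))
        return ("atmost", d[1] - 1, d[2], nnf(d[3]))
    raise ValueError(f"unknown constructor: {dop}")
```

In this encoding, the paper's $\sim C$ corresponds to `nnf(("not", C))`.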

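Definition 3. $T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{J})$ is a tableau for $\mathcal{A}$ w.r.t. $\mathcal{R}$ iff 

- $\mathbf{S}$ is a non-empty set, 

- $\mathcal{L} : \mathbf{S} \to 2^{\mathit{clos}(\mathcal{A})}$ maps each element in $\mathbf{S}$ to a set of concepts, 

- $\mathcal{E} : \mathbf{R}_{\mathcal{A}} \to 2^{\mathbf{S} \times \mathbf{S}}$ maps each role to a set of pairs of elements in $\mathbf{S}$, and 

- $\mathcal{J} : \mathbf{I}_{\mathcal{A}} \to \mathbf{S}$ maps individuals occurring in $\mathcal{A}$ to elements in $\mathbf{S}$. 

Furthermore, for all $s, t \in \mathbf{S}$, $C, C_1, C_2 \in \mathit{clos}(\mathcal{A})$, and $R, S \in \mathbf{R}_{\mathcal{A}}$, $T$ satisfies: 

(P1) if $C \in \mathcal{L}(s)$, then $\neg C \notin \mathcal{L}(s)$, 

(P2) if $C_1 \sqcap C_2 \in \mathcal{L}(s)$, then $C_1 \in \mathcal{L}(s)$ and $C_2 \in \mathcal{L}(s)$, 

(P3) if $C_1 \sqcup C_2 \in \mathcal{L}(s)$, then $C_1 \in \mathcal{L}(s)$ or $C_2 \in \mathcal{L}(s)$, 

(P4) if $\forall S.C \in \mathcal{L}(s)$ and $(s, t) \in \mathcal{E}(S)$, then $C \in \mathcal{L}(t)$, 

(P5) if $\exists S.C \in \mathcal{L}(s)$, then there is some $t \in \mathbf{S}$ such that $(s, t) \in \mathcal{E}(S)$ and $C \in \mathcal{L}(t)$, 

(P6) if $\forall S.C \in \mathcal{L}(s)$ and $(s, t) \in \mathcal{E}(R)$ for some $R \sqsubseteq^* S$ with $\mathsf{Trans}(R)$, then $\forall R.C \in \mathcal{L}(t)$, 

(P7) $(x, y) \in \mathcal{E}(R)$ iff $(y, x) \in \mathcal{E}(\mathsf{Inv}(R))$, 

(P8) if $(s, t) \in \mathcal{E}(R)$ and $R \sqsubseteq^* S$, then $(s, t) \in \mathcal{E}(S)$, 

(P9) if $\leqslant n S.C \in \mathcal{L}(s)$, then $\sharp S^T(s, C) \leqslant n$, 

(P10) if $\geqslant n S.C \in \mathcal{L}(s)$, then $\sharp S^T(s, C) \geqslant n$, 

(P11) if $(\bowtie n\, S\, C) \in \mathcal{L}(s)$ and $(s, t) \in \mathcal{E}(S)$, then $C \in \mathcal{L}(t)$ or $\sim C \in \mathcal{L}(t)$, 

(P12) if $a : C \in \mathcal{A}$, then $C \in \mathcal{L}(\mathcal{J}(a))$, 

(P13) if $(a, b) : R \in \mathcal{A}$, then $(\mathcal{J}(a), \mathcal{J}(b)) \in \mathcal{E}(R)$, 

(P14) if $a \not\doteq b \in \mathcal{A}$, then $\mathcal{J}(a) \neq \mathcal{J}(b)$, 

where $\bowtie$ is a place-holder for both $\leqslant$ and $\geqslant$, and $S^T(s, C) := \{t \in \mathbf{S} \mid (s, t) \in \mathcal{E}(S) \text{ and } C \in \mathcal{L}(t)\}$. 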



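Lemma 2. A SHIQ-Abox $\mathcal{A}$ is consistent w.r.t. $\mathcal{R}$ iff there exists a tableau for $\mathcal{A}$ w.r.t. 
$\mathcal{R}$. 

Proof: For the if direction, if $T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{J})$ is a tableau for $\mathcal{A}$ w.r.t. $\mathcal{R}$, a model 
$\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ of $\mathcal{A}$ and $\mathcal{R}$ can be defined as follows: 

$$
\begin{array}{ll}
 & \Delta^{\mathcal{I}} := \mathbf{S} \\
\text{for concept names } A \text{ in } \mathit{clos}(\mathcal{A}): & A^{\mathcal{I}} := \{s \mid A \in \mathcal{L}(s)\} \\
\text{for individual names } a \in \mathbf{I}: & a^{\mathcal{I}} := \mathcal{J}(a) \\
\text{for role names } R \in \mathbf{R}: & R^{\mathcal{I}} := \begin{cases} \mathcal{E}(R)^+ & \text{if } \mathsf{Trans}(R) \\ \mathcal{E}(R) \cup \bigcup_{P \sqsubseteq^* R,\ P \neq R} P^{\mathcal{I}} & \text{otherwise} \end{cases}
\end{array}
$$

where $\mathcal{E}(R)^+$ denotes the transitive closure of $\mathcal{E}(R)$. The interpretation of non-transitive 
roles is recursive in order to correctly interpret those non-transitive roles that have a 
transitive sub-role. From the definition of $R^{\mathcal{I}}$ and (P8), it follows that, if $(s, t) \in S^{\mathcal{I}}$, 
then either $(s, t) \in \mathcal{E}(S)$ or there exists a path $(s, s_1), (s_1, s_2), \ldots, (s_n, t) \in \mathcal{E}(R)$ for 
some $R$ with $\mathsf{Trans}(R)$ and $R \sqsubseteq^* S$. 

Due to (P8) and by definition of $\mathcal{I}$, we have that $\mathcal{I}$ is a model of $\mathcal{R}$. 

To prove that $\mathcal{I}$ is a model of $\mathcal{A}$, we show that $C \in \mathcal{L}(s)$ implies $s \in C^{\mathcal{I}}$ for any 
$s \in \mathbf{S}$. Together with (P12), (P13), and the interpretation of individuals and roles, this 
implies that $\mathcal{I}$ satisfies each assertion in $\mathcal{A}$. This proof can be given by induction on the 
length $\|C\|$ of a concept $C$ in NNF, where we count neither negation nor integers in 
number restrictions. The only interesting case is $C = \forall S.E$: let $t \in \mathbf{S}$ with $(s, t) \in S^{\mathcal{I}}$. 
There are two possibilities: 

- $(s, t) \in \mathcal{E}(S)$. Then (P4) implies $E \in \mathcal{L}(t)$. 

- $(s, t) \notin \mathcal{E}(S)$. Then there exists a path $(s, s_1), (s_1, s_2), \ldots, (s_n, t) \in \mathcal{E}(R)$ for 
some $R$ with $\mathsf{Trans}(R)$ and $R \sqsubseteq^* S$. Then (P6) implies $\forall R.E \in \mathcal{L}(s_i)$ for all 
$1 \leqslant i \leqslant n$, and (P4) implies $E \in \mathcal{L}(t)$. 

In both cases, $t \in E^{\mathcal{I}}$ by induction and hence $s \in C^{\mathcal{I}}$. 

For the converse, for $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ a model of $\mathcal{A}$ w.r.t. $\mathcal{R}$, we define a tableau 
$T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{J})$ for $\mathcal{A}$ and $\mathcal{R}$ as follows: 

$\mathbf{S} := \Delta^{\mathcal{I}}$, $\mathcal{E}(R) := R^{\mathcal{I}}$, $\mathcal{L}(s) := \{C \in \mathit{clos}(\mathcal{A}) \mid s \in C^{\mathcal{I}}\}$, and $\mathcal{J}(a) = a^{\mathcal{I}}$. 

It is easy to demonstrate that $T$ is a tableau for $\mathcal{A}$. □ 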

3.2 The Tableau Algorithm 

In this section, we present a completion algorithm that tries to construct, for an input 
Abox $\mathcal{A}$ and a role hierarchy $\mathcal{R}$, a tableau for $\mathcal{A}$ w.r.t. $\mathcal{R}$. We prove that this algorithm 
constructs a tableau for $\mathcal{A}$ and $\mathcal{R}$ iff there exists a tableau for $\mathcal{A}$ and $\mathcal{R}$, and thus decides 
consistency of SHIQ Aboxes w.r.t. role hierarchies. 

Since Aboxes might involve several individuals with arbitrary role relationships 
between them, the completion algorithm works on a forest rather than on a tree, which is 
the basic data structure for those completion algorithms deciding satisfiability of a con- 
cept. Such a forest is a collection of trees whose root nodes correspond to the individuals 
present in the input Abox. In the presence of transitive roles, blocking is employed to 
ensure termination of the algorithm. In the additional presence of inverse roles, blocking 
is dynamic, i.e., blocked nodes (and their sub-branches) can be un-blocked and blocked 
again later. In the additional presence of number restrictions, pairs of nodes are blocked 
rather than single nodes. 

Definition 4. A completion forest $\mathcal{F}$ for a SHIQ Abox $\mathcal{A}$ is a collection of trees whose 
distinguished root nodes are possibly connected by edges in an arbitrary way. Moreover, 
each node $x$ is labelled with a set $\mathcal{L}(x) \subseteq \mathit{clos}(\mathcal{A})$ and each edge $(x, y)$ is labelled with 
a set $\mathcal{L}((x, y)) \subseteq \mathbf{R}_{\mathcal{A}}$ of (possibly inverse) roles occurring in $\mathcal{A}$. Finally, completion 
forests come with an explicit inequality relation $\not\doteq$ on nodes and an explicit equality 
relation $\doteq$ which are implicitly assumed to be symmetric. 

If nodes $x$ and $y$ are connected by an edge $(x, y)$ with $R \in \mathcal{L}((x, y))$ and $R \sqsubseteq^* S$, 
then $y$ is called an $S$-successor of $x$ and $x$ is called an $\mathsf{Inv}(S)$-predecessor of $y$. If $y$ is 
an $S$-successor or an $\mathsf{Inv}(S)$-predecessor of $x$, then $y$ is called an $S$-neighbour of $x$. A 
node $y$ is a successor (resp. predecessor or neighbour) of $x$ if it is an $S$-successor (resp. 
$S$-predecessor or $S$-neighbour) of $x$ for some role $S$. Finally, ancestor is the transitive 
closure of predecessor. 

For a role $S$, a concept $C$ and a node $x$ in $\mathcal{F}$ we define $S^{\mathcal{F}}(x, C)$ by 
$S^{\mathcal{F}}(x, C) := \{y \mid y \text{ is an } S\text{-neighbour of } x \text{ and } C \in \mathcal{L}(y)\}$. 

A node is blocked iff it is not a root node and it is either directly or indirectly 
blocked. A node $x$ is directly blocked iff none of its ancestors are blocked, and it has 
ancestors $x'$, $y$ and $y'$ such that 

1. $y$ is not a root node and 

2. $x$ is a successor of $x'$ and $y$ is a successor of $y'$ and 

3. $\mathcal{L}(x) = \mathcal{L}(y)$ and $\mathcal{L}(x') = \mathcal{L}(y')$ and 

4. $\mathcal{L}((x', x)) = \mathcal{L}((y', y))$. 

In this case we will say that $y$ blocks $x$. 

A node $y$ is indirectly blocked iff one of its ancestors is blocked, or it is a successor 
of a node $x$ and $\mathcal{L}((x, y)) = \emptyset$; the latter condition avoids wasted expansions after an 
application of the $\leqslant$-rule. 
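
Conditions 1 to 4 of pair-wise blocking can be checked by walking up the tree from a node. The sketch below assumes a hypothetical representation that is not taken from the paper: `parent` maps every non-root node to its predecessor, `node_label` maps nodes to frozensets of concepts, and `edge_label` maps (parent, child) pairs to frozensets of roles. It only looks for a witness $y$; the extra requirement that no ancestor of $x$ is itself blocked is left to the caller.

```python
def ancestors(x, parent):
    """Yield the proper ancestors of x, nearest first."""
    while x in parent:
        x = parent[x]
        yield x

def blocking_witness(x, parent, node_label, edge_label):
    """Return an ancestor y satisfying conditions 1-4 for x, or None."""
    if x not in parent:          # root nodes are never blocked
        return None
    x_pred = parent[x]
    for y in ancestors(x, parent):
        if y not in parent:      # condition 1: y must not be a root node
            continue
        y_pred = parent[y]       # condition 2: y is a successor of y'
        if (node_label[x] == node_label[y]                # condition 3
                and node_label[x_pred] == node_label[y_pred]
                and edge_label[(x_pred, x)] == edge_label[(y_pred, y)]):  # condition 4
            return y
    return None
```

Comparing both the node labels and the connecting edge label is what distinguishes pair-wise blocking from the simpler subset or equality blocking used for logics without number restrictions.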

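Given a SHIQ-Abox $\mathcal{A}$ and a role hierarchy $\mathcal{R}$, the algorithm initialises a completion 
forest $\mathcal{F}_{\mathcal{A}}$ consisting only of root nodes. More precisely, $\mathcal{F}_{\mathcal{A}}$ contains a root node 
$x_0^i$ for each individual $a_i \in \mathbf{I}_{\mathcal{A}}$ occurring in $\mathcal{A}$, and an edge $(x_0^i, x_0^j)$ if $\mathcal{A}$ contains an 
assertion $(a_i, a_j) : R$ for some $R$. The labels of these nodes and edges and the relations 
$\not\doteq$ and $\doteq$ are initialised as follows: 

$\mathcal{L}(x_0^i) := \{C \mid a_i : C \in \mathcal{A}\}$, 

$\mathcal{L}((x_0^i, x_0^j)) := \{R \mid (a_i, a_j) : R \in \mathcal{A}\}$, 

$x_0^i \not\doteq x_0^j$ iff $a_i \not\doteq a_j \in \mathcal{A}$, and 

the $\doteq$-relation is initialised to be empty. $\mathcal{F}_{\mathcal{A}}$ is then expanded by repeatedly applying 
the rules from Figure 1. 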
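
The initialisation step translates almost literally into code. In this sketch the Abox is assumed to be a list of tagged tuples, `("concept", a, C)`, `("role", a, b, R)`, and `("neq", a, b)`; this encoding and the dictionary layout of the forest are illustrative choices, not the data structures of the paper or of FaCT. Root nodes are simply named after the individuals they represent.

```python
def initial_forest(abox):
    """Build the initial completion forest: one root node per individual."""
    nodes, edges, distinct = {}, {}, set()
    for assertion in abox:
        kind = assertion[0]
        if kind == "concept":            # a : C
            _, a, c = assertion
            nodes.setdefault(a, set()).add(c)
        elif kind == "role":             # (a, b) : R
            _, a, b, r = assertion
            nodes.setdefault(a, set())
            nodes.setdefault(b, set())
            edges.setdefault((a, b), set()).add(r)
        elif kind == "neq":              # a and b are asserted to be distinct
            _, a, b = assertion
            nodes.setdefault(a, set())
            nodes.setdefault(b, set())
            distinct.add(frozenset((a, b)))
        else:
            raise ValueError(f"unknown assertion kind: {kind}")
    # The equality relation starts out empty, as in the text.
    return {"nodes": nodes, "edges": edges, "distinct": distinct, "equal": set()}
```

Storing inequalities as two-element frozensets keeps the relation symmetric without extra bookkeeping.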

For a node $x$, $\mathcal{L}(x)$ is said to contain a clash if, for some concept name $A \in \mathbf{C}$, 
$\{A, \neg A\} \subseteq \mathcal{L}(x)$, or if there is some concept $\leqslant n S.C \in \mathcal{L}(x)$ and $x$ has $n + 1$ 
$S$-neighbours $y_0, \ldots, y_n$ with $C \in \mathcal{L}(y_i)$ and $y_i \not\doteq y_j$ for all $0 \leqslant i < j \leqslant n$. A completion 
forest is clash-free if none of its nodes contains a clash, and it is complete if no rule 
from Figure 1 can be applied to it. 

For a SHIQ-Abox $\mathcal{A}$, the algorithm starts with the completion forest $\mathcal{F}_{\mathcal{A}}$. It applies 
the expansion rules in Figure 1, stopping when a clash occurs, and answers "$\mathcal{A}$ is 
consistent w.r.t. $\mathcal{R}$" iff the completion rules can be applied in such a way that they 
yield a complete and clash-free completion forest, and "$\mathcal{A}$ is inconsistent w.r.t. $\mathcal{R}$" 
otherwise. 
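
The two kinds of clash can be detected locally at a node. The sketch below uses assumed, illustrative data structures: `label` is the set of tuple-encoded concepts of the node, `s_neighbours` maps each role $S$ to a dictionary from the node's $S$-neighbours to their labels, and `distinct` holds asserted inequalities as two-element frozensets. The search for $n + 1$ pairwise distinct neighbours is done by brute force, which is adequate only for small examples.

```python
from itertools import combinations

def has_clash(label, s_neighbours, distinct):
    """Check the two clash conditions for a single node."""
    for c in label:
        # Clash 1: a concept name together with its negation.
        if c[0] == "name" and ("not", c) in label:
            return True
        # Clash 2: an at-most restriction violated by n + 1 neighbours
        # that are pairwise asserted to be distinct.
        if c[0] == "atmost":
            _, n, role, filler = c
            candidates = [y for y, lab in s_neighbours.get(role, {}).items()
                          if filler in lab]
            for group in combinations(candidates, n + 1):
                if all(frozenset(pair) in distinct
                       for pair in combinations(group, 2)):
                    return True
    return False
```

Neighbours that are not asserted to be distinct do not cause a clash here, because the $\leqslant$-rule and the $\leqslant_r$-rule may still identify them.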

Since both the $\leqslant$-rule and the $\leqslant_r$-rule are rather complicated, they deserve some 
more explanation. Both rules deal with the situation where a concept $\leqslant n R.C \in \mathcal{L}(x)$ 
requires the identification of two $R$-neighbours $y$, $z$ of $x$ that contain $C$ in their labels. 
Of course, $y$ and $z$ may only be identified if $y \not\doteq z$ is not asserted. If these conditions 
are met, then one of the two rules can be applied. The $\leqslant$-rule deals with the case where 
at least one of the nodes to be identified, namely $y$, is not a root node, and this can lead 
to one of two possible situations, both shown in Figure 2. The upper situation occurs 
when both $y$ and $z$ are successors of $x$. In this case, we add the label of $y$ to that of 
$z$, and the label of the edge $(x, y)$ to the label of the edge $(x, z)$. Finally, $z$ inherits all 
inequalities from $y$, and $\mathcal{L}((x, y))$ is set to $\emptyset$, thus blocking $y$ and all its successors. 

The second situation occurs when both $y$ and $z$ are neighbours of $x$, but $z$ is the 
predecessor of $x$. Again, $\mathcal{L}(y)$ is added to $\mathcal{L}(z)$, but in this case the inverse of $\mathcal{L}((x, y))$ 
is added to $\mathcal{L}((z, x))$, because the edge $(x, y)$ was pointing away from $x$ while $(z, x)$ 
points towards it. Again, $z$ inherits the inequalities from $y$ and $\mathcal{L}((x, y))$ is set to $\emptyset$. 
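
The two situations handled when a non-root node $y$ is merged into $z$ can be sketched as a single update function. The dictionaries `nodes` and `edges`, the set `distinct`, and the function name `merge_non_root` are assumptions made for illustration; the sketch performs only the "then" part of the rule and expects the caller to have checked the rule's preconditions. Whether $z$ is the predecessor of $x$ is approximated here by the presence of an edge $(z, x)$.

```python
def inv(role):
    """Inverse of a role in the string encoding with a trailing '-'."""
    return role[:-1] if role.endswith("-") else role + "-"

def merge_non_root(nodes, edges, distinct, x, y, z):
    """Merge the non-root neighbour y of x into the neighbour z of x (in place)."""
    nodes[z] |= nodes[y]                      # z receives y's label
    if (z, x) in edges:                       # z is the predecessor of x
        edges[(z, x)] |= {inv(r) for r in edges[(x, y)]}
    else:                                     # z is a successor of x
        edges.setdefault((x, z), set()).update(edges[(x, y)])
    edges[(x, y)] = set()                     # y becomes indirectly blocked
    for pair in list(distinct):               # z inherits y's inequalities
        if y in pair:
            (u,) = pair - {y}
            distinct.add(frozenset((u, z)))
```

Emptying the edge label instead of deleting $y$ matches the text: the node stays in the forest but is blocked together with all its successors.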

The $\leqslant_r$-rule handles the identification of two root nodes. An example of the whole 
procedure is given in the lower part of Figure 2. In this case, special care has to be taken 
to preserve the relations introduced into the completion forest due to role assertions in 
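$\sqcap$-rule: 
  if 1. $C_1 \sqcap C_2 \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and 
     2. $\{C_1, C_2\} \not\subseteq \mathcal{L}(x)$ 
  then $\mathcal{L}(x) \to \mathcal{L}(x) \cup \{C_1, C_2\}$ 

$\sqcup$-rule: 
  if 1. $C_1 \sqcup C_2 \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and 
     2. $\{C_1, C_2\} \cap \mathcal{L}(x) = \emptyset$ 
  then $\mathcal{L}(x) \to \mathcal{L}(x) \cup \{E\}$ for some $E \in \{C_1, C_2\}$ 

$\exists$-rule: 
  if 1. $\exists S.C \in \mathcal{L}(x)$, $x$ is not blocked, and 
     2. $x$ has no $S$-neighbour $y$ with $C \in \mathcal{L}(y)$ 
  then create a new node $y$ with $\mathcal{L}((x, y)) := \{S\}$ and $\mathcal{L}(y) := \{C\}$ 

$\forall$-rule: 
  if 1. $\forall S.C \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and 
     2. there is an $S$-neighbour $y$ of $x$ with $C \notin \mathcal{L}(y)$ 
  then $\mathcal{L}(y) \to \mathcal{L}(y) \cup \{C\}$ 

$\forall_+$-rule: 
  if 1. $\forall S.C \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and 
     2. there is some $R$ with $\mathsf{Trans}(R)$ and $R \sqsubseteq^* S$, 
     3. there is an $R$-neighbour $y$ of $x$ with $\forall R.C \notin \mathcal{L}(y)$ 
  then $\mathcal{L}(y) \to \mathcal{L}(y) \cup \{\forall R.C\}$ 

choose-rule: 
  if 1. $(\bowtie n\, S\, C) \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and 
     2. there is an $S$-neighbour $y$ of $x$ with $\{C, \sim C\} \cap \mathcal{L}(y) = \emptyset$ 
  then $\mathcal{L}(y) \to \mathcal{L}(y) \cup \{E\}$ for some $E \in \{C, \sim C\}$ 

$\geqslant$-rule: 
  if 1. $\geqslant n S.C \in \mathcal{L}(x)$, $x$ is not blocked, and 
     2. there are no $n$ $S$-neighbours $y_1, \ldots, y_n$ such that $C \in \mathcal{L}(y_i)$ 
        and $y_i \not\doteq y_j$ for $1 \leqslant i < j \leqslant n$ 
  then create $n$ new nodes $y_1, \ldots, y_n$ with $\mathcal{L}((x, y_i)) = \{S\}$, 
       $\mathcal{L}(y_i) = \{C\}$, and $y_i \not\doteq y_j$ for $1 \leqslant i < j \leqslant n$. 

$\leqslant$-rule: 
  if 1. $\leqslant n S.C \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and 
     2. $\sharp S^{\mathcal{F}}(x, C) > n$, there are $S$-neighbours $y$, $z$ of $x$ with not $y \not\doteq z$, 
        $y$ is neither a root node nor an ancestor of $z$, and $C \in \mathcal{L}(y) \cap \mathcal{L}(z)$, 
  then 1. $\mathcal{L}(z) \to \mathcal{L}(z) \cup \mathcal{L}(y)$ and 
       2. if $z$ is an ancestor of $x$ 
          then $\mathcal{L}((z, x)) \to \mathcal{L}((z, x)) \cup \mathsf{Inv}(\mathcal{L}((x, y)))$ 
          else $\mathcal{L}((x, z)) \to \mathcal{L}((x, z)) \cup \mathcal{L}((x, y))$ 
       3. $\mathcal{L}((x, y)) \to \emptyset$ 
       4. Set $u \not\doteq z$ for all $u$ with $u \not\doteq y$ 

$\leqslant_r$-rule: 
  if 1. $\leqslant n S.C \in \mathcal{L}(x)$, and 
     2. $\sharp S^{\mathcal{F}}(x, C) > n$ and there are two $S$-neighbours $y$, $z$ of $x$ 
        which are both root nodes, $C \in \mathcal{L}(y) \cap \mathcal{L}(z)$, and not $y \not\doteq z$ 
  then 1. $\mathcal{L}(z) \to \mathcal{L}(z) \cup \mathcal{L}(y)$ and 
       2. For all edges $(y, w)$: 
          i. if the edge $(z, w)$ does not exist, create it with $\mathcal{L}((z, w)) := \emptyset$ 
          ii. $\mathcal{L}((z, w)) \to \mathcal{L}((z, w)) \cup \mathcal{L}((y, w))$ 
       3. For all edges $(w, y)$: 
          i. if the edge $(w, z)$ does not exist, create it with $\mathcal{L}((w, z)) := \emptyset$ 
          ii. $\mathcal{L}((w, z)) \to \mathcal{L}((w, z)) \cup \mathcal{L}((w, y))$ 
       4. Set $\mathcal{L}(y) := \emptyset$ and remove all edges to/from $y$. 
       5. Set $u \not\doteq z$ for all $u$ with $u \not\doteq y$. 
       6. Set $y \doteq z$. 

Fig. 1. The Expansion Rules for SHIQ-Aboxes. 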
the Abox, and to memorise the identification of root nodes (this will be needed in order 
to construct a tableau from a complete and clash-free completion forest). The $\leqslant_r$-rule 
includes some additional steps that deal with these issues. Firstly, as well as adding $\mathcal{L}(y)$ 
to $\mathcal{L}(z)$, the edges (and their respective labels) between $y$ and its neighbours are also 
added to $z$. Secondly, $\mathcal{L}(y)$ and all edges going from/to $y$ are removed from the forest. 
This will not lead to dangling trees, because all neighbours of $y$ became neighbours of 
$z$ in the previous step. Finally, the identification of $y$ and $z$ is recorded in the $\doteq$ relation. 

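Lemma 3. Let $\mathcal{A}$ be a SHIQ-Abox and $\mathcal{R}$ a role hierarchy. The completion algorithm 
terminates when started for $\mathcal{A}$ and $\mathcal{R}$. 

Proof: Let $m = \sharp\mathit{clos}(\mathcal{A})$, $n = |\mathbf{R}_{\mathcal{A}}|$, and $n_{\max} := \max\{n \mid\ \geqslant n R.C \in \mathit{clos}(\mathcal{A})\}$. 
Termination is a consequence of the following properties of the expansion rules: 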









Ian Horrocks, Ulrike Sattler, and Stephan Tobies 



1. The expansion rules never remove nodes from the forest. The only rules that remove 
elements from the labels of edges or nodes are the $\leqslant$-rule and the $\leqslant_r$-rule, which set them 
to $\emptyset$. If an edge label is set to $\emptyset$ by the $\leqslant$-rule, the node below this edge is blocked 
and will remain blocked forever. The $\leqslant_r$-rule only sets the label of a root node $x$ 
to $\emptyset$, and after this, $x$'s label is never changed again since all edges to/from $x$ are 
removed. Since no root nodes are generated, this removal may only happen a finite 
number of times, and the new edges generated by the $\leqslant_r$-rule guarantee that the 
resulting structure is still a completion forest. 

2. Nodes are labelled with subsets of $\mathit{clos}(\mathcal{A})$ and edges with subsets of $\mathbf{R}_{\mathcal{A}}$, so there 
are at most $2^{2mn}$ different possible labellings for a pair of nodes and an edge. 
Therefore, if a path $p$ is of length at least $2^{2mn}$, the pair-wise blocking condition 
implies the existence of two nodes $x$, $y$ on $p$ such that $y$ directly blocks $x$. Since a 
path on which nodes are blocked cannot become longer, paths are of length at most 
$2^{2mn}$. 

3. Only the $\exists$- or the $\geqslant$-rule generate new nodes, and each generation is triggered 
by a concept of the form $\exists R.C$ or $\geqslant n R.C$ in $\mathit{clos}(\mathcal{A})$. Each of these concepts 
triggers the generation of at most $n_{\max}$ successors $y_i$: note that if the $\leqslant$- or the 
$\leqslant_r$-rule subsequently causes $\mathcal{L}((x, y_i))$ to be changed to $\emptyset$, then $x$ will have some 
$R$-neighbour $z$ with $\mathcal{L}(z) \supseteq \mathcal{L}(y)$. This, together with the definition of a clash, implies 
that the rule application which led to the generation of $y_i$ will not be repeated. Since 
$\mathit{clos}(\mathcal{A})$ contains a total of at most $m$ concepts $\exists R.C$, the out-degree of the forest is bounded 
by $m \, n_{\max} \, n$. □ 



Lemma 4. Let $\mathcal{A}$ be a SHIQ-Abox and $\mathcal{R}$ a role hierarchy. If the expansion rules can 
be applied to $\mathcal{A}$ and $\mathcal{R}$ such that they yield a complete and clash-free completion forest, 
then $\mathcal{A}$ has a tableau w.r.t. $\mathcal{R}$. 

Proof: Let $\mathcal{F}$ be a complete and clash-free completion forest. The definition of a tableau 
$T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{J})$ from $\mathcal{F}$ works as follows. Intuitively, an individual in $\mathbf{S}$ corresponds 
to a path in $\mathcal{F}$ from some root node to some node that is not blocked, and which goes 
only via non-root nodes. 

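More precisely, a path is a sequence of pairs of nodes of $\mathcal{F}$ of the form 
$p = [\frac{x_0}{x'_0}, \ldots, \frac{x_n}{x'_n}]$. For such a path we define $\mathsf{Tail}(p) := x_n$ and $\mathsf{Tail}'(p) := x'_n$. With 
$[p \mid \frac{x_{n+1}}{x'_{n+1}}]$ we denote the path $[\frac{x_0}{x'_0}, \ldots, \frac{x_n}{x'_n}, \frac{x_{n+1}}{x'_{n+1}}]$. The set $\mathsf{Paths}(\mathcal{F})$ is defined 
inductively as follows: 

- For root nodes $x_0^i$ of $\mathcal{F}$, $[\frac{x_0^i}{x_0^i}] \in \mathsf{Paths}(\mathcal{F})$, and 

- For a path $p \in \mathsf{Paths}(\mathcal{F})$ and a node $z$ in $\mathcal{F}$: 

  • if $z$ is a successor of $\mathsf{Tail}(p)$ and $z$ is neither blocked nor a root node, then 
    $[p \mid \frac{z}{z}] \in \mathsf{Paths}(\mathcal{F})$, or 

  • if, for some node $y$ in $\mathcal{F}$, $y$ is a successor of $\mathsf{Tail}(p)$ and $z$ blocks $y$, then 
    $[p \mid \frac{z}{y}] \in \mathsf{Paths}(\mathcal{F})$. 

Please note that, since root nodes are never blocked, nor are they blocking other nodes, 
the only place where they occur in a path is in the first place. Moreover, by construction 
of $\mathsf{Paths}(\mathcal{F})$, if $p \in \mathsf{Paths}(\mathcal{F})$, then $\mathsf{Tail}(p)$ is not blocked, $\mathsf{Tail}(p) = \mathsf{Tail}'(p)$ iff 
$\mathsf{Tail}'(p)$ is not blocked, and $\mathcal{L}(\mathsf{Tail}(p)) = \mathcal{L}(\mathsf{Tail}'(p))$. 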

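We define a tableau $T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{J})$ as follows: 

$$
\begin{array}{rcl}
\mathbf{S} & = & \mathsf{Paths}(\mathcal{F}) \\
\mathcal{L}(p) & = & \mathcal{L}(\mathsf{Tail}(p)) \\
\mathcal{E}(R) & = & \{(p, [p \mid \frac{x}{x'}]) \in \mathbf{S} \times \mathbf{S} \mid x' \text{ is an } R\text{-successor of } \mathsf{Tail}(p)\}\ \cup \\
 & & \{([q \mid \frac{x}{x'}], q) \in \mathbf{S} \times \mathbf{S} \mid x' \text{ is an } \mathsf{Inv}(R)\text{-successor of } \mathsf{Tail}(q)\}\ \cup \\
 & & \{([\frac{x}{x}], [\frac{y}{y}]) \in \mathbf{S} \times \mathbf{S} \mid x, y \text{ are root nodes, and } y \text{ is an } R\text{-neighbour of } x\} \\
\mathcal{J}(a_i) & = & \begin{cases} [\frac{x_0^i}{x_0^i}] & \text{if } x_0^i \text{ is a root node in } \mathcal{F} \text{ with } \mathcal{L}(x_0^i) \neq \emptyset \\ [\frac{x_0^j}{x_0^j}] & \text{if } \mathcal{L}(x_0^i) = \emptyset,\ x_0^j \text{ a root node in } \mathcal{F} \text{ with } \mathcal{L}(x_0^j) \neq \emptyset \text{ and } x_0^i \doteq x_0^j \end{cases}
\end{array}
$$

Please note that $\mathcal{L}(x) = \emptyset$ implies that $x$ is a root node and that there is another root 
node $y$ with $\mathcal{L}(y) \neq \emptyset$ and $x \doteq y$. We show that $T$ is a tableau for $\mathcal{A}$. 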

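- $T$ satisfies (P1) because $\mathcal{F}$ is clash-free.

- (P2) and (P3) are satisfied by $T$ because $\mathcal{F}$ is complete.

- For (P4), let $p, q \in \mathbf{S}$ with $\forall R.C \in \mathcal{L}(p)$, $\langle p, q\rangle \in \mathcal{E}(R)$. If $q = [p \mid \frac{x}{x'}]$, then $x'$ is an
$R$-successor of $\mathsf{Tail}(p)$ and, due to completeness of $\mathcal{F}$, $C \in \mathcal{L}(x') = \mathcal{L}(x) = \mathcal{L}(q)$.
If $p = [q \mid \frac{x}{x'}]$, then $x'$ is an $\mathsf{Inv}(R)$-successor of $\mathsf{Tail}(q)$ and, due to completeness of
$\mathcal{F}$, $C \in \mathcal{L}(\mathsf{Tail}(q)) = \mathcal{L}(q)$. If $p = [\frac{x}{x}]$ and $q = [\frac{y}{y}]$ for two root nodes $x, y$, then $y$
is an $R$-neighbour of $x$, and completeness of $\mathcal{F}$ yields $C \in \mathcal{L}(y) = \mathcal{L}(q)$. (P6) and
(P11) hold for similar reasons.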

- For (P5), let $\exists R.C \in \mathcal{L}(p)$ and $\mathsf{Tail}(p) = x$. Since $x$ is not blocked and $\mathcal{F}$ is complete,
$x$ has some $R$-neighbour $y$ with $C \in \mathcal{L}(y)$.

• If $y$ is a successor of $x$, then $y$ can either be a root node or not.

* If $y$ is not a root node: if $y$ is not blocked, then $q := [p \mid \frac{y}{y}] \in \mathbf{S}$; if $y$ is
blocked by some node $z$, then $q := [p \mid \frac{z}{y}] \in \mathbf{S}$.

* If $y$ is a root node: since $y$ is a successor of $x$, $x$ is also a root node. This
implies $p = [\frac{x}{x}]$ and $q = [\frac{y}{y}] \in \mathbf{S}$.

• If $x$ is an $\mathsf{Inv}(R)$-successor of $y$, then either

* $p = [q \mid \frac{x}{x'}]$ with $\mathsf{Tail}(q) = y$.

* $p = [q \mid \frac{x}{x'}]$ with $\mathsf{Tail}(q) = u \neq y$. Since $x$ only has one predecessor, $u$
is not the predecessor of $x$. This implies $x \neq x'$, $x$ blocks $x'$, and $u$ is
the predecessor of $x'$ due to the construction of $\mathsf{Paths}$. Together with the
definition of the blocking condition, this implies $\mathcal{L}(\langle u, x'\rangle) = \mathcal{L}(\langle y, x\rangle)$
as well as $\mathcal{L}(u) = \mathcal{L}(y)$ due to the blocking condition.

* $p = [\frac{x}{x}]$ with $x$ being a root node. Hence $y$ is also a root node and $q = [\frac{y}{y}]$.

In any of these cases, $\langle p, q\rangle \in \mathcal{E}(R)$ and $C \in \mathcal{L}(q)$.

- (P7) holds because of the symmetric definition of the mapping $\mathcal{E}$.

- (P8) is due to the definition of $R$-neighbours and $R$-successor.

- Suppose (P9) were not satisfied. Hence there is some $p \in \mathbf{S}$ with $(\leq n\, S.C) \in \mathcal{L}(p)$
and $\sharp S^T(p, C) > n$. We will show that this implies $\sharp S^{\mathcal{F}}(\mathsf{Tail}(p), C) > n$,
contradicting either clash-freeness or completeness of $\mathcal{F}$. Let $x := \mathsf{Tail}(p)$ and
$P := S^T(p, C)$. We distinguish two cases:

• $P$ contains only paths of the form $[p \mid \frac{y}{y'}]$ and $[\frac{y}{y}]$. Then $\sharp P > n$ is impossible
since the function $\mathsf{Tail}'$ is injective on $P$: if we assume that there are two distinct
paths $q_1, q_2 \in P$ and $\mathsf{Tail}'(q_1) = \mathsf{Tail}'(q_2) = y'$, then this implies that each $q_i$
is of the form $q_i = [p \mid \frac{y_i}{y'}]$ or $q_i = [\frac{y'}{y'}]$. From $q_1 \neq q_2$, we have that $q_i = [p \mid \frac{y_i}{y'}]$
holds for some $i \in \{1, 2\}$. Since root nodes occur only in the beginning of
paths and $q_1 \neq q_2$, we have $q_1 = [p \mid \frac{y_1}{y'}]$ and $q_2 = [p \mid \frac{y_2}{y'}]$. If $y'$ is not
blocked, then $y_1 = y' = y_2$, contradicting $q_1 \neq q_2$. If $y'$ is blocked in $\mathcal{F}$, then
both $y_1$ and $y_2$ block $y'$, which implies $y_1 = y_2$, again a contradiction. Hence
$\mathsf{Tail}'$ is injective on $P$ and thus $\sharp P = \sharp \mathsf{Tail}'(P)$. Moreover, for each $y' \in
\mathsf{Tail}'(P)$, $y'$ is an $S$-successor of $x$ and $C \in \mathcal{L}(y')$. This implies $\sharp S^{\mathcal{F}}(x, C) > n$.

• $P$ contains a path $q$ where $p = [q \mid \frac{x}{x'}]$. Obviously, $P$ may only contain one
such path. As in the previous case, $\mathsf{Tail}'$ is an injective function on the set
$P' := P \setminus \{q\}$, each $y' \in \mathsf{Tail}'(P')$ is an $S$-successor of $x$, and $C \in \mathcal{L}(y')$ for
each $y' \in \mathsf{Tail}'(P')$. Let $z := \mathsf{Tail}(q)$. We distinguish two cases:

* $x = x'$. Hence $x$ is not blocked, and thus $x$ is an $\mathsf{Inv}(S)$-successor of $z$.
Since $\mathsf{Tail}'(P')$ contains only successors of $x$ we have that $z \notin \mathsf{Tail}'(P')$
and, by construction, $z$ is an $S$-neighbour of $x$ with $C \in \mathcal{L}(z)$.

* $x \neq x'$. This implies that $x'$ is blocked by $x$ and that $x'$ is an $\mathsf{Inv}(S)$-
successor of $z$. Due to the definition of pairwise-blocking this implies that
$x$ is an $\mathsf{Inv}(S)$-successor of some node $u$ with $\mathcal{L}(u) = \mathcal{L}(z)$. Again, $u \notin
\mathsf{Tail}'(P')$ and, by construction, $u$ is an $S$-neighbour of $x$ and $C \in \mathcal{L}(u)$.

- For (P10), let $(\geq n\, S.C) \in \mathcal{L}(p)$. Hence there are $n$ $S$-neighbours $y_1, \ldots, y_n$ of
$x = \mathsf{Tail}(p)$ in $\mathcal{F}$ with $C \in \mathcal{L}(y_i)$. For each $y_i$ there are three possibilities:

• $y_i$ is an $S$-successor of $x$ and $y_i$ is not blocked in $\mathcal{F}$. Then $q_i := [p \mid \frac{y_i}{y_i}]$, or $y_i$ is
a root node and $q_i := [\frac{y_i}{y_i}]$, is in $\mathbf{S}$.

• $y_i$ is an $S$-successor of $x$ and $y_i$ is blocked in $\mathcal{F}$ by some node $z$. Then $q_i =
[p \mid \frac{z}{y_i}]$ is in $\mathbf{S}$. Since the same $z$ may block several of the $y_j$s, it is indeed
necessary to include $y_i$ explicitly into the path to make them distinct.

• $x$ is an $\mathsf{Inv}(S)$-successor of $y_i$. There may be at most one such $y_i$ if $x$ is not
a root node. Hence either $p = [q_i \mid \frac{x}{x'}]$ with $\mathsf{Tail}(q_i) = y_i$, or $p = [\frac{x}{x}]$ and
$q_i = [\frac{y_i}{y_i}]$.

Hence for each $y_i$ there is a different path $q_i$ in $\mathbf{S}$ with $S \in \mathcal{L}(\langle p, q_i\rangle)$ and $C \in
\mathcal{L}(y_i)$, and thus $\sharp S^T(p, C) \geq n$.

- (P12) is due to the fact that, when the completion algorithm is started for an Abox
$\mathcal{A}$, the initial completion forest $\mathcal{F}_{\mathcal{A}}$ contains, for each individual name $a_i$ occurring
in $\mathcal{A}$, a root node $x_0^i$ with $\mathcal{L}(x_0^i) = \{C \in \mathsf{clos}(\mathcal{A}) \mid a_i : C \in \mathcal{A}\}$. The algorithm
never blocks root individuals, and, for each root node $x_0^i$ whose label and edges
are removed by the $\leq_r$-rule, there is another root node $x_0^j$ with $x_0^i \doteq x_0^j$ and
$\{C \in \mathsf{clos}(\mathcal{A}) \mid a_i : C \in \mathcal{A}\} \subseteq \mathcal{L}(x_0^j)$. Together with the definition of $\mathcal{I}$, this yields (P12).

(P13) is satisfied for similar reasons.

- (P14) is satisfied because the $\leq_r$-rule does not identify two root nodes $x_0^i, y_0^i$ when
$x_0^i \not\doteq y_0^i$ holds. □
Lemma 5. Let $\mathcal{A}$ be a $\mathcal{SHIQ}$-Abox and $\mathcal{R}$ a role hierarchy. If $\mathcal{A}$ has a tableau w.r.t.
$\mathcal{R}$, then the expansion rules can be applied to $\mathcal{A}$ and $\mathcal{R}$ such that they yield a complete
and clash-free completion forest.

Proof: Let $T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{I})$ be a tableau for $\mathcal{A}$ and $\mathcal{R}$. We use $T$ to trigger the
application of the expansion rules such that they yield a completion forest $\mathcal{F}$ that is both
complete and clash-free. To this purpose, a function $\pi$ is used which maps the nodes of
$\mathcal{F}$ to elements of $\mathbf{S}$. The mapping $\pi$ is defined as follows:

- For individuals $a_i$ in $\mathcal{A}$, we define $\pi(x_0^i) := \mathcal{I}(a_i)$.

- If $\pi(x) = s$ is already defined, and a successor $y$ of $x$ was generated for $\exists R.C \in
\mathcal{L}(x)$, then $\pi(y) = t$ for some $t \in \mathbf{S}$ with $C \in \mathcal{L}(t)$ and $\langle s, t\rangle \in \mathcal{E}(R)$.

- If $\pi(x) = s$ is already defined, and successors $y_i$ of $x$ were generated for $(\geq n\, R.C) \in
\mathcal{L}(x)$, then $\pi(y_i) = t_i$ for $n$ distinct $t_i \in \mathbf{S}$ with $C \in \mathcal{L}(t_i)$ and $\langle s, t_i\rangle \in \mathcal{E}(R)$.

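Obviously, the mapping for the initial completion forest for $\mathcal{A}$ and $\mathcal{R}$ satisfies the
following conditions:

$$
\left.
\begin{array}{l}
\mathcal{L}(x) \subseteq \mathcal{L}(\pi(x)), \\
\text{if } y \text{ is an } S\text{-neighbour of } x, \text{ then } \langle \pi(x), \pi(y)\rangle \in \mathcal{E}(S), \text{ and} \\
x \not\doteq y \text{ implies } \pi(x) \neq \pi(y).
\end{array}
\right\} \quad (*)
$$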

It can be shown that the following claim holds:

Claim: Let $\mathcal{F}$ be generated by the completion algorithm for $\mathcal{A}$ and $\mathcal{R}$ and let $\pi$ satisfy
(*). If an expansion rule is applicable to $\mathcal{F}$, then this rule can be applied such that it
yields a completion forest $\mathcal{F}'$ and a (possibly extended) $\pi$ that satisfy (*).

As a consequence of this claim, (P1), and (P9), if $\mathcal{A}$ and $\mathcal{R}$ have a tableau, then the
expansion rules can be applied to $\mathcal{A}$ and $\mathcal{R}$ such that they yield a complete and
clash-free completion forest. □

From Theorem 1 and Lemmas 2, 3, 4, and 5, we thus have the following theorem:

Theorem 2. The completion algorithm is a decision procedure for the consistency of
$\mathcal{SHIQ}$-Aboxes and the satisfiability and subsumption of concepts with respect to role
hierarchies and terminologies.



4 Conclusion 

We have presented an algorithm for deciding the satisfiability of $\mathcal{SHIQ}$ KBs where the
Abox may be non-empty and where the uniqueness of individual names is not assumed 
but can be asserted in the Abox. This algorithm is of particular interest as it can be used 
to decide the problem of conjunctive query containment w.r.t. a schema [17]. 

An implementation of the $\mathcal{SHIQ}$ Tbox satisfiability algorithm is already available
in the FaCT system [14], and is able to reason efficiently with Tboxes derived from real- 
istic ER schemas. This suggests that the algorithm presented here could form the basis 
of a practical decision procedure for the query containment problem. Work is already 
underway to test this conjecture by extending the FaCT system with an implementation 
of the new algorithm. 







Ian Horrocks, Ulrike Sattler, and Stephan Tobies



References 

1. C. Areces, P. Blackburn, and M. Marx. A road-map on complexity for hybrid logics. In 
Proc. of CSL’99, number 1683 in LNCS, pages 307-321. Springer-Verlag, 1999.

2. F. Baader. Augmenting concept languages by transitive closure of roles: An alternative to 
terminological cycles. In Proc. of IJCAI-91 , 1991. 

3. F. Baader, H.-J. Bürckert, B. Nebel, W. Nutt, and G. Smolka. On the expressivity of feature
logics with negation, functional uncertainty, and sort equations. Journal of Logic, Language
and Information, 2:1-18, 1993.

4. F. Baader, H.-J. Heinsohn, B. Hollunder, J. Müller, B. Nebel, W. Nutt, and H.-J. Profitlich.
Terminological knowledge representation: A proposal for a terminological logic. Technical
Memo TM-90-04, DFKI, Saarbrücken, Germany, 1991.

5. P. Blackburn and J. Seligman. What are hybrid languages? In Advances in Modal Logic,
volume 1, pages 41-62. CSLI Publications, Stanford University, 1998. 

6. M. Buchheit, F. M. Donini, and A. Schaerf. Decidable reasoning in terminological knowl- 
edge representation systems. J. of Artificial Intelligence Research, 1:109-138, 1993. 

7. D. Calvanese. Reasoning with inclusion axioms in description logics: Algorithms and com- 
plexity. In Proc. of ECAI’96, pages 303-307. John Wiley & Sons Ltd., 1996. 

8. D. Calvanese, G. De Giacomo, and M. Lenzerini. On the decidability of query containment 
under constraints. In Proc. of PODS’98, pages 149-158. 1998. 

9. D. Calvanese, G. De Giacomo, M. Lenzerini, D. Nardi, and R. Rosati. Source integration in 
data warehousing. In Proc. of DEXA-98. IEEE Computer Society Press, 1998. 

10. Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, Daniele Nardi, and Riccardo 
Rosati. Description logic framework for information integration. In Proc. of KR-98, 1998.

11. G. De Giacomo and F. Massacci. Combining deduction and model checking into tableaux
and algorithms for converse-PDL. Information and Computation, 1998. To appear. 

12. Giuseppe De Giacomo and Maurizio Lenzerini. What’s in an aggregate: Foundations for 
description logics with tuples and sets. In Proc. of IJCAI-95, 1995. 

13. V. Haarslev and R. Möller. An empirical evaluation of optimization strategies for abox
reasoning in expressive description logics. In Lambrix et al. [19], pages 115-119.

14. I. Horrocks. FaCT and iFaCT. In Lambrix et al. [19], pages 133-135.

15. I. Horrocks, A. Rector, and C. Goble. A description logic based schema for the classification
of medical data. In Proc. of the 3rd Workshop KRDB’96. CEUR, June 1996.

16. I. Horrocks and U. Sattler. A description logic with transitive and inverse roles and role
hierarchies. Journal of Logic and Computation, 9(3):385-410, 1999.

17. I. Horrocks, U. Sattler, S. Tessaris, and S. Tobies. Query containment using a DLR ABox.
LTCS-Report 99-15, LuFG Theoretical Computer Science, RWTH Aachen, Germany, 1999.

18. I. Horrocks, U. Sattler, and S. Tobies. Practical reasoning for expressive description logics.
In Proc. of LPAR’99, number 1705 in LNAI, pages 161-180. Springer-Verlag, 1999.

19. P. Lambrix, A. Borgida, M. Lenzerini, R. Möller, and P. Patel-Schneider, editors. Proc. of
the International Workshop on Description Logics (DL’99), 1999.

20. E. Mays, R. Weida, R. Dionne, M. Laker, B. White, C. Liang, and F. J. Oles. Scalable and
expressive medical terminologies. In Proc. of the 1996 AMIA Annual Fall Symposium, 1996.

21. U. Sattler. A concept language extended with different kinds of transitive roles. In 20.
Deutsche Jahrestagung für KI, volume 1137 in LNAI. Springer-Verlag, 1996.

22. A. Schaerf. Reasoning with individuals in concept languages. Data and Knowledge Engi- 
neering, 13(2):141-176, 1994. 

23. K. Schild. A correspondence theory for terminological logics: Preliminary report. In J. 
Mylopoulos, R. Reiter, editors, Proc. of IJCAI-91, Sydney, 1991. 

24. M. Schmidt-Schauß and G. Smolka. Attributive concept descriptions with complements.
Artificial Intelligence, 48(1):1-26, 1991.




System Description: Embedding Verification into 

Microsoft Excel* 



Graham Collins¹ and Louise A. Dennis²

¹ Department of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK
² Division of Informatics, University of Edinburgh
Edinburgh EH1 1HN, UK



Abstract. The aim of the Prosper project is to allow the embedding of 
existing verification technology into applications in such a way that the 
theorem proving is hidden, or presented to the end user in a natural way. 
This paper describes a system built to test whether the Prosper toolkit 
satisfied this aim. The system combines the toolkit with Microsoft Excel, 
a popular commercial spreadsheet application. 



1 Introduction 

The Prosper project is researching and developing a toolkit [1] that allows 
an expert to easily and flexibly assemble proof engines from existing tools to 
provide embedded formal reasoning support inside applications. The ultimate 
goal is to make the reasoning and proof support invisible to the end-user — or at 
least, more realistically, to incorporate it securely within the interface and style 
of interaction to which they are already accustomed. Several large case studies 
are taking place within the project to investigate this. 

This paper describes a preliminary case study embedding verification into Mi- 
crosoft Excel without inventing or re-implementing any existing theorem proving 
techniques or mathematical decision procedures.¹ The primary aim was to show
that the technology is effective when applied to real, standard applications not 
designed by project members. In addition we were interested in investigating a 
“lightweight” theorem proving approach where only a small amount of theorem 
proving functionality is added but it is completely hidden from the user. 

This paper begins with a brief overview of the Prosper toolkit (§2) and 
Excel (§3) followed by a discussion of the system developed. 

2 Extending Applications with Custom Proof Engines 

A central part of Prosper’s vision is the idea of a proof engine — a custom built 
verification engine which can be operated by another program through an
Application Programming Interface (API). A proof engine can be built by a system

* Work funded by ESPRIT Framework IV Grant LTR 26241
¹ The case study is available from http://www.collins-peak.net/p-excel/
developer using the toolkit provided by the project. A proof engine is based upon
the functionality of a theorem prover with additional capabilities provided by
‘plugins’ formed from existing, off-the-shelf, tools. The toolkit includes a set of
libraries based on a language-independent specification, the Prosper Integration
Interface (PII), for communication between components of a final system.
The theorem prover’s command language is treated as a kind of scripting or glue 
language for managing plugin components and orchestrating the proofs. 

The PII consists of several parts. There is a datatype for communication of 
data between components of a system which includes the language of higher order 
logic used by the HOL system [2] and so any formula expressible in higher order 
logic can be passed between components. There is support for installing proce- 
dures in an API and calling them remotely. There are also parts for managing 
low level communication, which are largely invisible to an application developer. 
The PII is currently implemented in ML, C, Java, Python, λProlog and Ada.

Proof engines are constructed on top of a small subset of HOL, called the 
Core Proof Engine. This consists of theorems, inference rules for higher order 
logic and an ML implementation of the PII. A developer can write extensions to 
the Core Proof Engine and place them in an API to form a custom proof engine. 
When incorporating a proof engine into an application the developer calls the 
customised API through the PII. 

3 Microsoft Excel 

Excel is a spreadsheet package marketed by Microsoft [4]. Its basic constituents
are rows and columns of cells into which either values or formulae may be entered. 
Formulae refer to other cells, which may contain either values or further formulae. 

Users of Excel are likely to have no interest in using or guiding mathematical 
proof, but they do want to know that they have entered formulae correctly. 
They therefore have an interest in ‘sanity checking functions’ that they can use 
to reassure themselves of correctness. This made Excel well suited as a case study:
since the users already have a notion of formulae and correctness, all that needs to be
hidden is the proof. Another advantage is that Excel was designed to allow new
functionality to be added and although its developers were not concerned with 
verification there is support for calling external tools. 

As a simple example, the authors undertook to incorporate a sanity checking 
function into Excel. We chose to implement an equality checking function which 
would take two cells containing formulae and attempt to determine whether 
these formulae were equal for all possible values of the cells to which they refer. 

Simplifying assumptions were made for the case study. The most important 
were that cell values were only natural numbers or booleans and that only a 
small subset of the functions available in Excel (some simple arithmetical and 
logical functions) appeared in formulae. Given these assumptions, less than 150 
lines of code were needed to produce a prototype. This prototype handled only 
a small range of formulae decidable by linear arithmetic or propositional logic 
decision procedures, but it demonstrated the basic functionality. 
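To make the idea of an equality check "for all possible values" concrete, here is a minimal sketch for the purely Boolean case. It is an assumption-laden illustration, not the prototype's method: it decides equivalence by exhaustive truth-table enumeration rather than by the propositional plugin used in the case study, and the function name `equal_for_all_inputs` and the representation of formulae as Python callables are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Sequence

Formula = Callable[[Dict[str, bool]], bool]


def equal_for_all_inputs(f: Formula, g: Formula, cells: Sequence[str]) -> bool:
    """Return True iff f and g agree under every Boolean assignment to cells."""
    for values in product([False, True], repeat=len(cells)):
        env = dict(zip(cells, values))      # one row of the truth table
        if f(env) != g(env):
            return False                    # found a distinguishing assignment
    return True


# NOT(AND(A1, B1)) against OR(NOT(A1), NOT(B1)): De Morgan's law
lhs = lambda e: not (e["A1"] and e["B1"])
rhs = lambda e: (not e["A1"]) or (not e["B1"])
print(equal_for_all_inputs(lhs, rhs, ["A1", "B1"]))
```

Enumeration is exponential in the number of referenced cells and does not extend to natural-number cells, which is one reason a real system would delegate to dedicated decision procedures.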
4 Architecture

The main difficulty in the system was that Excel is Windows based and expects 
Microsoft’s Component Object Model (COM) to be used for communication 
between processes, whereas the Prosper toolkit had been developed for UNIX 
machines² and uses sockets for communication between components.

Several possible solutions to this problem were considered including imple- 
menting the PII in Visual Basic and using internet sockets to let Excel commu- 
nicate with a proof engine. We did not take this approach because our aim was 
to show that theorem proving technology can be incorporated into applications 
in as natural a way as possible. For Excel this meant making the functionality 
of the Prosper tools available as a COM server. 

The Prosper COM server was implemented in Python, a dynamically typed, 
object oriented scripting language which supports both COM and sockets. The 
server consists of two parts: the Python implementation of the PII and the
additional code described below which is specific to this example.

The remaining decision was where to convert Excel’s formulae, which we 
access as strings, into terms. This requires some type inference but is simple to 
do and could have been written in either the Python or Visual Basic components. 
This was done in Python since it was the preferred language of the authors. 
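The conversion from formula strings to typed terms might look roughly like the following sketch. Everything here is an assumption made for illustration: the tuple-based term representation, the two types `num` and `bool`, and the use of Python's own `ast` parser as a stand-in for an Excel formula parser (it only works for the small fragment where the two syntaxes coincide, such as `+`, `-`, `*` and calls to `AND`, `OR`, `NOT`).

```python
import ast
from typing import Dict, Tuple

ARITH = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}
LOGIC = {"AND", "OR", "NOT"}


def parse_formula(text: str) -> Tuple[tuple, Dict[str, str]]:
    """Return (term, cell_types) for a small Excel-like formula string."""
    cell_types: Dict[str, str] = {}

    def visit(node: ast.AST, expected: str) -> tuple:
        if isinstance(node, ast.Name):
            # A cell reference takes its type from the context it is used in.
            previous = cell_types.setdefault(node.id, expected)
            if previous != expected:
                raise TypeError(f"{node.id} used as both {previous} and {expected}")
            return ("cell", node.id)
        if isinstance(node, ast.Constant) and type(node.value) is int:
            if expected != "num":
                raise TypeError("number used where a boolean is expected")
            return ("const", node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in ARITH:
            if expected != "num":
                raise TypeError("arithmetic used where a boolean is expected")
            return (ARITH[type(node.op)],
                    visit(node.left, "num"), visit(node.right, "num"))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in LOGIC):
            if expected != "bool":
                raise TypeError("logical function used where a number is expected")
            return (node.func.id, *[visit(arg, "bool") for arg in node.args])
        raise ValueError("unsupported formula fragment")

    body = ast.parse(text.lstrip("="), mode="eval").body
    top_type = "bool" if isinstance(body, ast.Call) else "num"
    return visit(body, top_type), cell_types
```

Propagating an expected type downwards is enough here because every operator in the fragment fixes the type of its arguments; a richer function set would need genuine unification.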

From the Excel side the Python component is a COM server which makes 
available a small number of functions that Excel can call. The use of a UNIX 
based theorem prover is not visible to Excel. From the proof engine side the 
Python component behaves like any other application calling the proof engine 
using the PII. The use of Excel is not visible to the theorem prover. 

A view of the current (2 operating system) architecture is shown below. 




5 Custom Proof Engine 

The initial custom proof procedure is very simple-minded. It uses a linear arith- 
metic decision procedure provided by HOL and a propositional logic plugin 
(based on Prover Technology’s proof tool [6,5]) to decide the truth of formu- 
lae. While the approach is not especially robust, it is strong enough to handle 
many formulae. 

² It is expected that a future version will be ported to Windows.
The additional code required to create this custom proof procedure is very
small (approx. 45 extra lines of ML were needed). All the verification code used
already existed either in HOL or the plugin; the new code concentrated on gluing
together the decision procedures and deciding which should be used.
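The control flow of such glue code can be sketched as follows. This is not the ML code from the case study; it is a hypothetical Python illustration in which each decision procedure is modelled as a function that returns `True`, `False`, or `None` when the goal lies outside its fragment, and `decide` and `Procedure` are invented names.

```python
from typing import Callable, Iterable, Optional

# A procedure answers True/False, or None if the goal is outside its fragment.
Procedure = Callable[[tuple], Optional[bool]]


def decide(goal: tuple, procedures: Iterable[Procedure]) -> Optional[bool]:
    """Try each procedure in turn; None stands for 'unable to decide'."""
    for procedure in procedures:
        verdict = procedure(goal)
        if verdict is not None:
            return verdict
    return None


# Toy stand-ins for an arithmetic and a propositional procedure.
def toy_arith(goal: tuple) -> Optional[bool]:
    return goal[1] == goal[2] if goal[0] == "arith" else None


def toy_prop(goal: tuple) -> Optional[bool]:
    return bool(goal[1]) if goal[0] == "prop" else None


print(decide(("arith", 3, 3), [toy_arith, toy_prop]))
```

Keeping a third outcome distinct from `False` matters for the user: a formula pair the engine cannot handle should not be reported as unequal.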

A proof engine which could handle a wider range of formulae would require 
more work. It is possible that more decision procedures could be used to provide
this; for instance, we could exploit HOL’s simplifier. Alternatively it might prove
necessary to implement some specialised theorem proving algorithms. This would 
also be possible using the Prosper toolkit. 

6 Python COM Component 

The main piece of code developed for this system is the Python implementation 
of the PII. This was simple to write, partly since the structure is similar to the 
existing Java PII, and partly because this is the sort of application for which 
Python was designed. The code makes use of dynamic typing and other features 
of the language to provide a compact and natural implementation of the PII. 
Although written for this one application, the Python implementation makes 
available the objects of the PII, and hence the functionality of the Prosper 
tools to any language that supports COM. 

In addition to the PII implementation the COM component contains some 
additional code specific to this example. This first parses the strings to logical 
terms. This assumes that the semantics of the operators is the same in Excel 
and HOL. The terms are then passed on to the proof engine. It returns the result 
of the proof attempt as true, false, or ‘unable to decide’, which is displayed in 
the cell containing the ISEQUAL formula. This result can be used by other cells 
and will be automatically recomputed if necessary. 



7 Excel Macro 

We wrote a Visual Basic function, ISEQUAL, using Excel’s macro editor. Once written,
it automatically appears in Excel’s function list as a User Defined Function
and can be used in a spreadsheet like any other function. ISEQUAL takes two cell 
references as arguments. It recursively extracts the formulae contained in the 
cells as strings (support for this already exists in Excel) and passes them on to 
the Python object. The macro consists of about 30 lines of Visual Basic code. 
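The recursive extraction step can be illustrated with a short sketch. The original macro is Visual Basic and is not reproduced here; the Python below is a hypothetical model in which a sheet is a dictionary from cell names to their contents, formulae start with `=`, and cells holding plain values are left as symbolic references so that the later check ranges over all their possible values. The regular expression is a simplification and would, for instance, misread a function name such as `LOG10` as a cell.

```python
import re
from typing import Dict, FrozenSet

CELL_REF = re.compile(r"\b[A-Z]+[0-9]+\b")   # e.g. A1, BC12 (simplified)


def expand(cell: str, sheet: Dict[str, str],
           _seen: FrozenSet[str] = frozenset()) -> str:
    """Return the formula of `cell` with formula-bearing references inlined."""
    if cell in _seen:
        raise ValueError(f"circular reference through {cell}")
    content = sheet.get(cell, "")
    if not content.startswith("="):
        return cell                          # value cell: keep it symbolic
    seen = _seen | {cell}
    body = content[1:]                       # drop the leading '='
    inlined = CELL_REF.sub(lambda m: expand(m.group(0), sheet, seen), body)
    return "(" + inlined + ")"


sheet = {"A1": "5", "B1": "=A1+1", "C1": "=B1*2"}
print(expand("C1", sheet))
```

With this sheet the call prints `((A1+1)*2)`: the reference to `B1` is replaced by its own formula, while `A1` stays a free variable.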

8 Conclusions 

There are numerous Add-Ins to Excel many of which, unsurprisingly, extend its 
mathematical ability. The Maple 6 Add-In provides computer algebra techniques 
to Excel spreadsheets. Interval Solver [3] extends Excel with Interval Constraint 
Solving to allow spreadsheet users to reason with incomplete and uncertain in- 
formation. We believe that theorem proving could also have a role to play in this 
field. We have demonstrated that the Prosper approach provides a framework
in which this could be done. 

We were surprised and pleased with the ease with which a very basic prototype
of verification support for Excel could be produced. It took two programmers,
neither of whom had any experience with Visual Basic, Python or COM, only
48 hours to get to the point where Excel was able to prove the commutativity
of plus. While this may seem uninteresting, the reordering of the mathematical 
operators in large formulae is exactly the kind of lightweight sanity check that 
may appeal to users. Extending the system to handle more arithmetic and logical 
operators was easy and the system has been tested on a range of linear arithmetic 
and spreadsheet style examples. The system could be extended further and more 
complex and interesting proof strategies could be programmed. 

The system is a proof of concept of the claim made by the Prosper project 
that their toolkit would enable the embedding of verification into applications 
not designed with it specifically in mind. The only significant piece of new code 
is the Python port of the PII which is a general purpose component that could 
be used for other systems. 

Adding even limited theorem proving functionality by programming a proce- 
dure from scratch instead of using existing tools would have taken much longer, 
as would interfacing to a theorem prover without using the Prosper tools. 

The use of two operating systems is not ideal but could be removed if the 
Prosper tools were ported to Windows. The current setup would be reasonable 
in a networked setting with many copies of Excel accessing one proof engine. 

The embedding of verification into Excel also serves as an example of the 
concepts of “lightweight” theorem proving and the “invisible” use of verification. 
Here all the infrastructure is invisible to the user who simply gets an extra 
function available in Excel. 



References 

1. L. A. Dennis, G. Collins, M. Norrish, R. Boulton, K. Slind, G. Robinson, M. Gordon 
and T. Melham, The PROSPER Toolkit, TACAS 2000, to appear. 2000.

2. M. J. C. Gordon and T. F. Melham (eds), Introduction to HOL: A theorem proving 
environment for higher order logic, Cambridge University Press, 1993. 

3. E. Hyvönen and S. De Pascale, A New Basis for Spreadsheet Computing: Interval
Solver™ for Microsoft Excel. Proceedings of 16th National Conference on Artificial
Intelligence and 11th Innovative Applications of Artificial Intelligence Conference
(AAAI/IAAI-99), AAAI Press / The MIT Press, pp. 799-806, 1999.

4. Microsoft Corporation, Microsoft Excel, http://www.microsoft.com/excel.

5. M. Sheeran and G. Stålmarck, A tutorial on Stålmarck’s proof procedure for
propositional logic. The Second International Conference on Formal Methods in
Computer-Aided Design, Lecture Notes in Computer Science 1522, Springer-Verlag,
pp. 82-99, 1998.

6. G. Stålmarck and M. Säflund, Modelling and Verifying Systems and Software in
Propositional Logic. Proceedings of SAFECOMP ’90, Pergamon Press, pp. 31-36, 
1990. 




System Description: Interactive Proof Critics in 
XBarnacle 



Mike Jackson¹ and Helen Lowe²

¹ Department of Electronic and Electrical Engineering, University of Edinburgh, Scotland
Michael.Jackson@ee.ed.ac.uk

² Department of Computer Studies, Glasgow Caledonian University
Cowcaddens Road, Glasgow G4 0BA, Scotland
H.Lowe@gcal.ac.uk 



1 Introduction 

Proof critics [2] extend the power of a theorem prover by, for example, allowing 
lemmas to be postulated and proved in the course of a proof. However, extending the 
automated theorem prover CLAM by adding critics also increases the search space. 
XBarnacle [5] was developed to make the process of interacting with a semi- 
automatic theorem prover more tractable for the non-expert user. We have now sub- 
stantially amplified and extended XBarnacle so that it makes the work of expert users 
more efficient as they interact with proof critics. Of course, we have also made cos- 
metic improvements to aid navigability and to bridge the gulf of evaluation [1] which 
proves such an obstacle in making theorem provers more accessible, and even their 
expert users more efficient. 



2 System Requirements 

In building the new version of XBarnacle, we were able to build on experience with 
several systems, each with strengths and weaknesses. An obvious meta-requirement 
was to try to incorporate the strengths of each whilst avoiding the weaknesses. 

- Clam version 3.2 could patch proofs by applying critics under certain patterns of 
failure in the pre-conditions of methods. However, this greatly increases the search 
space, making interaction necessary. 

- XClam [3] was a graphical interface to Clam version 3.2 but, like its parent system,
had no persistent representation of the partial plan, making undoing and re-planning
nodes impossible; it also used non-hierarchical planning, making navigation diffi- 
cult. 

- The version of XBarnacle (based on Clam version 2.2) reported in [5] allowed the
user to interact with the proof tree in the course of a proof but could not patch 
proofs automatically. 
- Another version of XBarnacle [4] incorporated critics in the “flat” structure of
Clam 3.2. This made proof trees very rapidly become large and unnavigable in
practice if not in theory. Coincidentally, during the evaluation users requested
passive as opposed to active critiquing (so that the decision to apply a critic was the 
user’s rather than the program’s). 

The proof engine of the new version of XBarnacle was therefore to be an amalgam of 
CLAM version 2.6, which has no critics but a hierarchical method set; and CLAM 
version 3.2, which incorporates critics in a flat method structure. In addition, we had 
found that a common request on the original XBarnacle system reported in [5] was to 
be able to “open up” the high-level proof steps to see the individual smaller steps 
within. This facility of hierarchical tree browsing has therefore been provided in the 
current system, not merely to help less experienced users see how the steps are per- 
formed, but so that expert users can interact directly with the step requiring the use of 
critics. 



3 Hierarchical Tree Browsing 

With previous versions of XBarnacle only the top level goals were retained and dis- 
played. The intermediate goals arising in sub-plans were not stored. For example, if 
the method used several rewrite rules only the final result of rewriting was shown, not 
the intermediate steps. We made modifications so that all of these steps are retained 
and can be displayed on request. This helps to bridge the gulf of evaluation for users 
who may find that the granularity is too high, and that the system is making too big a 
leap from one step to another. It also facilitates the use of critics. Experienced users 
can open up nodes until they see a likely looking sub-goal for which they know a
lemma will almost certainly be proposed by a critic.

A data structure in Prolog to facilitate hierarchical tree browsing was provided as a 
set of tree_nodes. The Tcl/Tk side also has a tree_node array. This array is indexed by 
the canvas in which the sub-plan will be drawn if the user requests it. It indicates 
whether any of the (sub-)goals may be critiqued, in other words, whether a critic 
might be applicable. 



4 Critics 

XBarnacle is a co-operative system and allows the user to critique nodes in the proof 
plan. Interactive proof critics will then propose ways in which the proof might be 
improved, for example by the addition of some wave rules or generalising a goal and 
the user is free to accept or reject the proposed patches. All nodes that may be cri- 
tiqued with the loaded critics will be marked by XBarnacle using a distinctive icon. 
XBarnacle allows the user to critique either a specific goal, in which case critics are
tested for applicability at that goal only; or an entire sub-plan, in which case every 
goal that is in the sub-plan (and recursively sub-plans of the sub-plan etc.) will be 
critiqued (i.e. the system will test for applicability of critics). The type of critiquing 
supported on a node will be indicated by a shaded mark on the node. The user can then 
indicate that they wish to critique the goal, possibly by using hierarchical tree brows- 
ing, or the sub-plan. 
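The difference between critiquing a single goal and critiquing a whole sub-plan can be sketched in a few lines. XBarnacle itself is built on Prolog and Tcl/Tk; the Python below is only an illustrative model, and the names `TreeNode`, `critique_goal` and `critique_sub_plan`, as well as the representation of a critic as a function from a goal to a list of proposed patches, are assumptions. Goals are assumed to be unique strings so that they can serve as dictionary keys.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Critic = Callable[[str], List[str]]          # goal -> proposed patches


@dataclass
class TreeNode:
    goal: str
    sub_plan: List["TreeNode"] = field(default_factory=list)


def critique_goal(node: TreeNode,
                  critics: Dict[str, Critic]) -> Dict[str, List[str]]:
    """Test the critics for applicability at this goal only."""
    proposals = {}
    for name, critic in critics.items():
        patches = critic(node.goal)
        if patches:                          # keep only critics that apply
            proposals[name] = patches
    return proposals


def critique_sub_plan(node: TreeNode,
                      critics: Dict[str, Critic]) -> Dict[str, Dict[str, List[str]]]:
    """Test the critics at every goal in the sub-plan, recursively."""
    found = {}
    for child in node.sub_plan:
        proposals = critique_goal(child, critics)
        if proposals:
            found[child.goal] = proposals
        found.update(critique_sub_plan(child, critics))   # nested sub-plans
    return found
```

The recursive variant visits every goal below the selected node, which matches the observation that critiquing a sub-plan can take noticeably longer than critiquing one goal.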

After the user has selected whether to critique a goal or a sub-plan, XBarnacle will
display a list of the critics that may be used to critique the goal or sub-plan. Currently
these are lemma calculation, lemma speculation, generalization, and induction
revision. The user can select one or more of these critics to try to apply to the
goal/sub-plan. For example, if the user thinks that the original goal must be generalized before
the theorem is provable, they will choose generalization. If there seems to be a
missing lemma, lemma calculation and/or lemma speculation will be chosen.

Once the critic is selected, XBarnacle will see if the chosen critics can propose 
some patches for the selected goal, or goals in a sub-plan. 

If the user has no idea which to choose, they can simply leave the choice to XBar- 
nacle. Instead of choosing a selected set of critics to apply as described above, press- 
ing the Critique with All Critics button causes XBarnacle to critique the goal/sub-plan 
using all the allowable critics. 

The user can then view the proposed patches. There will be a short delay as CLAM
tests whether the critics propose any patches. Critiquing a sub-plan may take a while
since every goal in the sub-plan hierarchy must be critiqued.

If any of the chosen critics propose patches for the goals being critiqued, then
XBarnacle will display a Proposed Patches window with a list of the possible
patches, each listed with the name of the critic that proposed it.

The user can apply a patch or retrieve a number of types of information about a 
patch. Some patches may need to be customized. All of these actions are specific to 
the currently selected patch. 

Clicking on a patch in the list of patches displayed selects that patch. Actions such 
as customization, apply patch, view locations, explanations and view patches as wave 
rules, will now be specific to the selected patch until the user selects another one. 

To apply a selected patch we just press the Apply button and XBarnacle will at- 
tempt to apply the selected patch. There will be a delay as XBarnacle plans the node 
where the patch is to be applied and attempts to apply the patch. If the attempt to apply 
the patch fails then XBarnacle will just apply an applicable method instead.

Two types of explanation may be viewed for each patch proposed in the possible
patches window. Both are more suitable for users familiar with proof planning and
rippling than for general or novice users.

- The Why critic applicable? button displays an explanation as to why the method 
associated with the critic failed, in terms of the pattern of precondition failure of the 
method, and some extra information; 

- The What critic does? button gives a general explanation as to what the critic will 
do to patch the proof. 

Patches intended to be used as wave rules, which, at present, are the patches pro- 
posed by the lemma calculation and lemma speculation critics, may be viewed as 
wave rules, shaded graphically. 

Some patches need to be customized by the user before they can be applied. The
user is expected to provide some information. On selecting a customizable patch the
Customize... button in the Proposed Patches window will become active. On pressing
this button XBarnacle will display the Customize window, displaying the patch. The
customizable parts of the patch (higher order meta-terms) will be displayed so that the
patch can be customized by editing these to provide the necessary instantiations. The
instantiations may use any of the variables listed in the Customize window, any of the
functions loaded into XBarnacle, and any of the standard constants and operators.
Infix versions of common functions, such as + for plus, are also accepted.

If the user has instantiated all the meta-terms in the patch then they can try to apply 
it. This is done by pressing the Apply button. XBarnacle will first perform checks to 
ensure that the patch contains no uninstantiated meta-terms, there are no unknown 
function or other symbols, the term is syntactically correct and that there are no type 
violations. 
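
The kind of checks listed above can be sketched as a single pass over a term tree. The following Python sketch is an assumption-laden illustration rather than XBarnacle's actual checker: Term and check_patch are hypothetical names, uninstantiated meta-terms are assumed to be marked with a leading "?", and a simple arity table stands in for the real syntax and type checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """A first-order style term: a head symbol applied to argument terms."""
    head: str
    args: tuple = ()


def check_patch(term, arities):
    """Collect the reasons why a customized patch cannot yet be applied.

    arities maps each known symbol to its expected number of arguments
    (an assumed stand-in for the loaded functions, constants and variables).
    """
    problems = []

    def walk(t):
        if t.head.startswith("?"):
            problems.append(f"uninstantiated meta-term: {t.head}")
        elif t.head not in arities:
            problems.append(f"unknown symbol: {t.head}")
        elif arities[t.head] != len(t.args):
            problems.append(f"wrong number of arguments for: {t.head}")
        for arg in t.args:
            walk(arg)

    walk(term)
    return problems
```

An empty result means the patch passes these simplified checks; any entry corresponds to something the user would still have to fix in the Customize window.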



5 Obtaining the System 

The interface side of XBarnacle is written in Tcl 8.0 and Tk 8.0. These may be
downloaded free from http://www.scriptics.com/software/8.0.html . The proof engine
is written in SICStus Prolog 3.5, obtainable from http://www.sics.se/isl/sicstus.html .
XBarnacle is available at http://members.xoom.com/helen_lowe/XBarnacle.tar . After
extracting the files, go to the tk_tcl/make subdirectory and edit the Makefile as de-
scribed in the file at http://members.xoom.com/helen_lowe/readme.txt . Follow the
instructions in that file to make and run the executable.

The doc/ sub-directory contains substantial user and developer information. The user
guide includes a short tutorial. The developer manual contains details of the architec-
ture, the interplay between Prolog and Tcl/Tk, the data structures used to hold information
to be displayed and to facilitate navigation, and the rationale behind some of the
design decisions taken.



6 Further Work 

The system was built for, and evaluated by, expert users. As HCI practitioners, we 
focus strongly on the user and the task for which the system is designed. One maxim is 
“Speak the User’s Language” [6]. The users in this case were all members of, or close 
associates of, the Mathematical Reasoning Group in the Division of Informatics at the 
University of Edinburgh. Their interests were in theorem proving per se. Redesigning 
the system so that, for example, it supports users of proof tools, interested primarily in 
program development, is not just a simple question of rewording explanations, al- 
though that could be easily done. However, much of what we have learned can be 
carried through to other users and other tasks. A hierarchical display of the proof 
seems fairly generally desirable, merely the level and granularity differing between 
novices and experts. There is no conclusive evidence that passive as opposed to active 
critiquing is desired by all users, but these could easily be provided as alternatives, a 
feature customizable by the user.




The logical framework LF [HHP93] is a meta-language for specifying formal
languages and related algorithms. It is typically used to represent programming 
languages, type systems and logics, such as operational semantics, compilers, 
natural deduction, sequent calculi, etc. For a survey on logical frameworks con- 
sult [Pfe99] . LF derives its expressive power from dependent types together with 
higher-order representation techniques which directly support common concepts 
in deductive systems such as variable binding, capture-avoiding substitutions, 
parametric and hypothetical judgments and substitution properties. 
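
The idea of higher-order representation can be illustrated outside LF. The sketch below is in Python, not Twelf, and has no dependent types; it only shows how representing an object-language binder by a host-language function makes substitution a function call, so that no explicit renaming code is needed. The names Const, App, Lam and whnf are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Lam:
    # The binder is a host-language function: applying it performs
    # the substitution of the argument for the bound variable.
    body: Callable[[object], object]


def whnf(term):
    """Reduce a term to weak head normal form by beta-reduction."""
    while isinstance(term, App):
        fn = whnf(term.fn)
        if isinstance(fn, Lam):
            term = fn.body(term.arg)   # beta step: substitution by application
        else:
            return App(fn, term.arg)   # head is not a binder, so stop
    return term
```

For instance, with K = Lam(lambda x: Lam(lambda y: x)), the term App(App(K, Const("a")), Const("b")) reduces to Const("a") without any variable names being compared or renamed.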

Meta-logical frameworks, on the other hand, extend logical frameworks by the
ability to formalize and in many cases to mechanize the task of reasoning about
those languages and their algorithms. Their purpose is to improve the designer's
productivity by automating tedious and error-prone aspects of their task.

Twelf is such a meta-logical framework. It extends the logical framework
LF by a meta-logic M_2^+ [Sch00]. Twelf is designed as a formal tool to express
and reason about properties of systems represented in LF while taking advan-
tage of the expressive power of the underlying logical framework. For example,
Twelf has been successfully employed to derive various properties of program-
ming languages, type systems and logics, such as type preservation
and progress for various operational semantics, the consistency of logics, and the
admissibility of new inference rules. Other results include automatic proofs of
the Church-Rosser theorem, cut-elimination for various logics, and soundness and
completeness of uniform proof search and resolution.

Unlike theorem provers that are based on standard principles requiring first- 
order encodings, Twelf derives its deductive power from marrying higher-order 
representation techniques with the technique of inductive reasoning. Therefore, 
in these special domains its automated reasoning power far exceeds that of any 
other theorem prover. 

Twelf is written in Standard ML and runs under SML of New Jersey and
MLWorks on Unix and Windows platforms. The current version is distributed
with a complete manual [PS98], example suites, a tutorial in the form of on-line
lecture notes [Pfe00], and an Emacs interface. Source and binary distributions
are accessible via the Twelf home page http://www.twelf.org.




The purpose of this tutorial is to introduce the Automated Deduction com-
munity to a growing area in which their expertise can be applied to a novel set
of problems. No knowledge of natural language processing will be assumed. (If 
you have time for some background reading, James Allen’s ‘Natural Language 
Understanding’, 2nd edition, Addison Wesley, 1995 can be recommended). 

In this tutorial I will describe and illustrate some of the areas in which NLU 
interacts with theorem proving, and say what our problems are. My hope is that 
out of this you will get some interesting new problems to work on, and that we 
will eventually get answers to some of our questions. 

The topics to be covered include: 

Semantic Assembly. At some point, all interesting natural language process- 
ing applications need to relate sentences to a meaning representation of some 
kind. In our case, the target representation is usually first order logic, aug- 
mented with some higher order constructs. We choose first order logic be- 
cause that is what it is easiest to mechanise inferences for, but natural lan- 
guage is intrinsically higher order. I will describe some of the typical problems 
that arise in choosing logical representations for English constructs that are 
(a) capable of being derived compositionally from parsed sentences and (b) 
capable of supporting the necessary inferences. Usually this is a balancing 
act between expressiveness and efficiency of inference. What we need from 
you is some guidance about what higher order constructs are ‘safe’ in that 
they can be mechanised with reasonable efficiency, and some lessons in how 
to transform or compile the rest into something that can be used in practical 
systems. 

Underspecified Representations. Unfortunately, sentences usually contain 
context-dependent constructs (pronouns, etc.) whose interpretation will de- 
pend on the circumstances in which the sentence is produced. This means 
that semantic interpretation has to take place in two stages: one relatively 
compositional phase in which the meanings of the words and their syntactic 
configuration are used to build a ‘quasi-logical form’, and a second stage in 
which inferences are made from the context to flesh out the quasi-logical 
form to something that is evaluable independently. We have two problems 
here: firstly, the ‘quasi-logical’ forms only have a semantics indirectly and 
so we need inference mechanisms for non-standard logics, or at least non-
standard, linguistically oriented representations. Secondly, the context may
not be fully represented either and so we often need to make conditional or
abductive inferences leading to conclusions of the form ‘P if Q holds’. Again,
many of the constructs we are dealing with are higher order, and typically,
the search spaces can be very large.

Disambiguation. Most sentences are ambiguous. We can use various meth- 
ods for choosing the most likely reading. Many readings can be eliminated 
because they are inconsistent with the model that has been built up by pre- 
vious sentences. I will describe current applications of model building and 
model checking for this purpose. Our problems here include (a) reasoning 
efficiently with large numbers of axioms (b) efficient reasoning with equality. 
Yet again, many of the constructs we are dealing with are higher order. 
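
The elimination of readings that clash with the discourse model can be sketched in a deliberately tiny propositional form. This is only an illustration under strong assumptions: real systems work with first-order or richer representations, and the names consistent and surviving_readings, as well as the encoding of a reading as a set of (atom, truth value) pairs, are hypothetical.

```python
def consistent(reading, model):
    """Check one reading against the model built from previous sentences.

    reading: a set of (atom, truth_value) pairs asserted by this reading.
    model:   a dict mapping already established atoms to truth values;
             atoms it does not mention are treated as unconstrained.
    """
    return all(model.get(atom, value) == value for atom, value in reading)


def surviving_readings(readings, model):
    """Keep only the readings that do not contradict the model."""
    return [reading for reading in readings if consistent(reading, model)]
```

If the model already records that bat1 is not an animal, a reading of "She saw the bat" that asserts animal(bat1) is discarded, while the reading that asserts artifact(bat1) survives.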

Some web sites which offer demos of NLP systems using theorem provers in 
the ways described above: 

— http://www.coli.uni-sb.de/~bos/doris: Johan Bos’s DORIS system 

— http://www.cs.rochester.edu/research/epilog: Len Schubert’s 
EPILOG demo 

— http://ubatuba.ccl.umist.ac.uk: Allan Ramsay’s PARASITE system 




Tutorial: Using TPS for Higher-Order Theorem 
Proving and ETPS for Teaching Logic

TPS is an automated theorem proving system which can be used to prove
theorems of first- or higher-order logic automatically, interactively, or in a com- 
bination of these modes of operation. Proofs in TPS are presented in natural 
deduction style. ETPS is a program which was obtained from TPS by deleting 
all the facilities for proving theorems automatically. ETPS can be used by stu- 
dents to learn how to prove theorems interactively. The objective of the tutorial 
is to teach participants how to make effective use of TPS and ETPS. 

Information about TPS, including manuals and information about obtaining 
the system, can be found at http://gtps.math.cmu.edu/tps.html.

ETPS is intended to be used as a teaching tool in logic courses. ETPS can be 
used effectively in a course which is concerned purely with first-order logic as well 
as one which also deals with higher-order logic. ETPS gives students immediate 
feedback for both correct and incorrect actions, and makes it easy to display 
selected parts of proofs, as well as modify and rearrange them. Proofs, and the 
active lines of the proof, are displayed in proof windows which are automatically 
updated as the proof is constructed interactively. ETPS enables students to 
construct rigorous proofs of more difficult theorems than they otherwise find 
tractable. ETPS checks proofs automatically, and creates records of the theorems 
proved by each student which can be automatically transferred to the teacher’s 
grade file. 

The logical language of TPS is Church’s type theory, a formulation of higher- 
order logic in which theorems of mathematics can be expressed very naturally. 
The notation of this language is displayed on the screen and in printed proofs. 
Definitions are handled elegantly by λ-notation. The tutorial presupposes famil-
iarity with first-order logic, but not with higher-order logic. The tutorial includes
an introduction to the notation of type theory, examples showing how to express 
theorems of mathematics (including those involving inductive definitions) in this 
language, and lessons on how to write theorems and definitions in TPS and put 
them into a TPS library. 

The facilities for constructing natural deduction proofs and an editor for wffs 
are common to TPS and ETPS. In addition, TPS has tactics for applying natural 
deduction rules of inference semi-automatically, and automatic procedures for 
constructing complete proofs or filling in gaps in partially completed natural
deduction proofs. TPS searches for proofs in automatic mode by first searching
for an expansion proof, and then translating this into a natural deduction proof.
TPS has a number of search procedures, and there are many flags which control
the behavior of TPS and set bounds for the many dimensions of proof search in
higher-order logic.

* The development of TPS and ETPS was supported by the National Science Foun-
dation under grant CCR-9732312 and previous grants.

TPS is designed to be a research tool as well as a theorem proving system. 
It has facilities for working on unification problems and mating searches, dis-
playing wffs in vertical path diagrams, printing proofs in various styles including
TeX, and translating back and forth between natural deduction proofs and expan-
sion proofs. TPS has library facilities, online help, and extensive documentation
(some of which is produced automatically).

The tutorial provides opportunities for hands-on experience with TPS and 
ETPS, and discussion of how to treat examples provided by the participants.




Workshop: Model Computation - Principles,
Algorithms, Applications

Computing models of first-order or propositional logic specifications is the com-
plementary problem of refutational theorem proving. A deduction system capa-
ble of producing models significantly extends the functionality of purely refu-
tational systems by providing the user with useful information when no
refutation exists.
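
For the propositional case, the complementary roles of refutation and model computation can be sketched with a brute-force search. This is a minimal Python illustration, not one of the calculi discussed at the workshop: find_model is a hypothetical name, clauses are assumed to be lists of non-zero integers in the DIMACS style (k for variable k, -k for its negation), and the exhaustive enumeration is exponential in the number of variables.

```python
from itertools import product


def find_model(clauses):
    """Return a satisfying assignment {variable: bool}, or None if none exists.

    A result of None plays the role of a refutation; any other result is a
    model, i.e. a counterexample to the claim that the clauses are unsatisfiable.
    """
    variables = sorted({abs(literal) for clause in clauses for literal in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # A clause holds if at least one of its literals agrees with the assignment.
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return assignment
    return None
```

On the clause set [[1, 2], [-1]] the search finds an assignment making variable 2 true and variable 1 false, whereas [[1], [-1]] has no model.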

Ideally, any theorem prover that terminates without finding a refutational 
proof should be able to output (information on) countermodels. Characterizing 
classes of inputs for which termination can be guaranteed, defining appropriate 
formalisms for representing such models, and providing algorithms for working 
with the resulting model representations (e.g., evaluating clauses, testing equiv- 
alence, etc.) is a great challenge in automated deduction. 

Computing models is becoming an increasingly important topic in automated
deduction. This is due to potential application areas such as disproving conjec-
tures in classical theorem proving and software verification, discourse representa-
tion in natural language, deductive databases, product configuration, hardware
verification, model-based diagnosis, planning, and model checking.

Some of these methods currently rely heavily on first-order logic with finite
domains or on propositional logic (e.g. model checking). On the other hand, methods
for computing models for first-order specifications have been emerging recently
by linking fields like term rewriting, term schematizations, and constraint eval-
uation, and their potential deserves much further exploration.

The workshop therefore emphasizes model construction principles for the
first-order case, while also welcoming contributions concentrating on
finite-domain and propositional logics, and on more expressive logics such as higher-
order and modal logics. More specifically, the goal of the workshop is to discuss
(non-exclusively) research on the following issues:

— Theoretical background, such as representation formalisms for models and 
their properties (like expressivity and complexity). 

— Calculi and respective procedures to compute models, implementations, ex- 
periments and performance issues. 

— Applications and related topics, such as finding the appropriate formulation, 
application problems, and problem sets. 

The goal of the workshop is to bring together researchers working on these and 
related topics. As an outcome, the workshop would help to identify important 
problems in model computation, concentrate our efforts coming from different 
directions to attack these problems, get new insights by mutually learning from 
the various aspects of model computation, and stimulate further research. 
Workshop home page: http://www.uni-koblenz.de/~peter/CADE17-WS-MODELS/



D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 513-513, 2000.
© Springer-Verlag Berlin Heidelberg 2000




Workshop: Automation of Proof by 
Mathematical Induction 



Carsten Schürmann

Carnegie Mellon University, Pittsburgh, USA 

Mathematical induction is required for reasoning about objects or events con- 
taining repetition, e.g. computer programs with recursion or iteration, electronic 
circuits with feedback loops or parameterized components, and properties that 
hold for all time forward. It is thus a vital ingredient of formal methods tech- 
niques for synthesizing, verifying and transforming software and hardware. The 
automation of proof by induction strengthens the capabilities of mechanical as- 
sistants, it reduces the need for designers to be skilled in mathematical proof 
techniques, and it improves productivity by automating tedious and error-prone 
aspects of formal system development. This workshop is organized around four 
sessions. 

Inductive Theorem Proving and Formal Methods: Formal system development is 
becoming a mature and established discipline and induction is one of the key 
techniques for dealing with abstract concepts. The aim of this session is to bring 
together the merits of inductive theorem proving techniques and formal methods 
in industrial application scenarios. 

Higher-Order Inductive Theorem Proving: Higher-order logics provide a rich 
framework for expressing and reasoning about formal specifications. The im- 
portance of mechanizing formal arguments within higher-order logics is reflected 
by the sustained growth in popularity of verification environments such as HOL, 
Isabelle, Nuprl, and PVS. The aim of this session is to discuss recent advances 
of automated reasoning techniques within the context of higher-order logics.

Integrating Inductive and High-Performance Theorem Provers: Many first-order
theorem provers are based on tableaux, matrix, and resolution techniques and 
their implementations are in general very efficient and highly specialized. The 
aim of this session is to elaborate how to integrate inductive theorem provers 
with other existing theorem proving technology. 

Meeting the Challenges: We are interested in problems which demonstrate the 
unique merits of inductive theorem proving techniques. Submitted challenge 
problems will be displayed on the homepage prior to the workshop. Researchers 
are invited to submit solutions or counter challenges. The aim of this workshop 
session is to debate the relative merits of challenges and their solutions. 

The workshop homepage is located at www.cs.cmu.edu/~carsten/apmi00 and
the workshop committee consists of Carsten Schürmann, Andrew Ireland, Deepak
Kapur, Christoph Kreitz, and Toby Walsh.




Workshop: Type-Theoretic Languages:
Proof-Search and Semantics 



Didier Galmiche 
LORIA - UHP, Nancy, France 

Much recent work has been devoted to type theory and its applications to proof-
and program-development in various logical settings. The focus of this workshop
is on proof-search, with a specific interest in semantic aspects of, and semantic
approaches to, type-theoretic languages and their underlying logics (e.g., classi-
cal, intuitionistic, linear, substructural). Such languages can be seen as logical
frameworks for representing proofs and in some cases formalize connections be-
tween proofs and programs that support program-synthesis.

The theory of proof-search has developed mostly along proof-theoretic lines but 
using many type-theoretic techniques. The utility of type-theoretic methods sug- 
gests that semantic methods of the kind found to be valuable in the semantics 
of programming languages should be useful in tackling the main outstanding 
difficulty in the theory of proof-search, i.e., the representation of intermediate 
stages in the search for a proof. The objective of the workshop is to provide 
a forum for discussion between, on the one hand, researchers interested in all 
aspects of proof-search in type theory, logical frameworks and their underlying 
(e.g., classical, intuitionistic, substructural) logics and, on the other, researchers 
interested in the semantics of computation. 

Topics of interest, in this context, include but are not restricted to the following:
foundations of proof-search in type-theoretic languages (sequent calculi, natural
deduction, logical frameworks, etc.); systems, methods and techniques related
to proof construction or to counter-model generation (tableaux, matrix, reso-
lution, semantic techniques, proof plans, etc.); decision procedures, strategies,
complexity results; logic programming as search-based computation, integration
of model-theoretic semantics, semantic foundations for search spaces; compu-
tational models based on structures such as games and realizability; proof synthesis
vs program synthesis and applications, equational theories and rewriting; ap-
plications of proof-theoretic and semantic techniques to the design and imple-
mentation of theorem provers.

Programme Committee: 

D. Galmiche, LORIA - UHP, Nancy, France. 

P. Lincoln, SRI, Stanford, U.S.A. 

F. Pfenning, CMU, Pittsburgh, U.S.A. 

D. Pym, Queen Mary and Westfield College, London, U.K. 

J. Smith, Chalmers University, Göteborg, Sweden.




Workshop: Automated Deduction in Education



Erica Melis 

Universität des Saarlandes, FB Informatik, Germany

One of the potential real-world applications of deduction systems is in math- 
ematics education. Patrick Suppes’ education system is an early pioneer in this 
regard, for example, and while the potential has been mentioned in discussions 
at previous CADE conferences, currently there is renewed interest in this topic 
as well as several activities and projects within the CADE community. 

In an intelligent tutoring system a deduction component might be used, e.g., to
provide the expert model, to provide potential models of erroneous reasoning,
as a basis for topic sequencing, or as a basis for automated diagnosis. However,
a mathematics education system will typically not consist of a deduction system
alone, and may not include one at all, because the need for explanation will
dominate the requirements for correctness in theorem proving in an educational
context. That is, the power of automated deduction has to be combined with
appropriate interfaces, user models, theory construction, and explanation
functionalities before a system can be didactically effective. Though extensive
production-quality systems are still in the future, some of the knowledge and
the knowledge representation that is currently used in automated and interactive
theorem-proving systems can be employed for educational needs as well.

A purpose of this workshop, in this application area of automated and inter- 
active theorem proving, is to establish more communication between current ed- 
ucation projects in the CADE community, to exchange ideas and opinions, and to 
make available the experience of education systems from other AI communities.
We explicitly encourage the submission of project descriptions. 

We plan to focus the workshop on the following topics and questions:

— How best can automated and interactive theorem-proving systems contribute 
to mathematics education? 

— What are the proof-presentation and explanation needs for such teaching? 

— What sort of integration of specialized reasoning systems (e.g., computer 
algebra systems) should we expect? 

— How do we generate good examples and counter examples in various sub- 
jects? 

— What is the role of knowledge-based theorem proving for mathematics edu- 
cation? 

— What are the human-factors requirements for good systems? 

— How do we evaluate the educational success of such systems? 

Further information is available at

http://www.ags.uni-sb.de/~melis/cade00ws.htm




Workshop: The Role of Automated Deduction in
Mathematics



Simon Colton, Volker Sorge, and Ursula Martin 

The purpose of this workshop is to discuss the role of automated deduction in 
all areas of mathematics. This will include looking at the interaction between au- 
tomated deduction programs and other computational systems which have been 
developed over recent years to automate different areas of mathematical activ- 
ity. Such systems include computer algebra packages, tutoring programs, math- 
ematical discovery systems and systems developed to help present and archive 
mathematical theories. The workshop will also include discussions of the use of 
automated theorem proving in the wider mathematical community. Presenta- 
tions which detail the employment of automated deduction techniques in any 
area of mathematical research have been encouraged. 

With initiatives such as the Calculemus project, automated deduction is in-
creasingly being seen not as an isolated area of research, but as part of an
integrated attack on the problem of automating mathematics. We are interested
in the interaction of automated theorem proving programs with (i) computer
algebra (CA) packages, (ii) constraint solvers, (iii) model generators, (iv) tutor-
ing systems, (v) interactive textbooks, (vi) theory formation programs and (vii)
mathematical databases. In all these fields automated deduction is either al-
ready used or could be fruitfully employed to enhance the power and reliability
of existing systems. Particular ongoing projects include the use of deduction to
certify CA systems, and also to enhance CA systems. Other projects include the 
incorporation of deduction into mathematical tutoring systems and interactive 
mathematical textbooks and the use of theory formation to help in automated 
theorem proving. The interaction between these programs could be in terms of 
improving automated deduction or in terms of using automated deduction to 
improve the techniques employed in the other system. 

The workshop is intended to inspire the use of automated deduction within 
other fields of mathematics as well as the incorporation of techniques from other 
fields into automated deduction. We intend to provide a forum for discussion 
between researchers from the field of automated deduction and researchers from 
particular domains of mathematics. In particular, the workshop will address 
mathematical results proved in part by automated deduction techniques as well 
as theorems which can potentially be proved with automated techniques. An 
original goal of automated theorem proving was its application to mathemat- 
ics, whether by proving established results, enhancing calculation techniques 
or facilitating discovery of new results. There is still much scope for the use 
of automated deduction to add to mathematics and we hope to explore these 
possibilities in the workshop. 

The workshop home page is located at: 

http://www.dai.ed.ac.uk/~simonco/conferences/CADE00